Abstract

We present an optimal, combinatorial $1 - 1/e$ approximation algorithm for monotone submodular
optimization over a matroid constraint. Compared to the continuous greedy algorithm
(Calinescu, Chekuri, Pál and Vondrák, 2008), our algorithm is extremely simple and requires
no rounding. It consists of the greedy algorithm followed by local search. Both phases are run 
not on the actual objective function, but on a related non-oblivious potential function, which 
is also monotone submodular. Our algorithm runs in randomized time 0{ti'^u^), where n is 
the rank of the given matroid and u is the size of its ground set. We additionally obtain a 
1 — 1/e — 0{e) approximation algorithm running in randomized time 0(e~^n^u). For matroids 
in which $n = o(u)$, this improves on the runtime of the continuous greedy algorithm. The
improvement is due primarily to the time required by the rounding phase, which we avoid alto- 
gether. Furthermore, the independence of our algorithm from rounding techniques suggests that 
our general approach may be helpful in contexts such as monotone submodular maximization 
subject to multiple matroid constraints. 

In our previous work on maximum coverage (Filmus and Ward, 2011), the potential function 
gives more weight to elements covered multiple times. We generalize this approach from cov- 
erage functions to arbitrary monotone submodular functions. When the objective function is a 
coverage function, both definitions of the potential function coincide. The parameters used to 
define the potential function are closely related to Padé approximants of $e^x$ evaluated at $x = 1$.
We use this connection to determine the approximation ratio of the algorithm. 

Our approach generalizes to the case where the monotone submodular function has restricted 
curvature. For any curvature $c$, we adapt our algorithm to produce a $(1 - e^{-c})/c$ approximation.
This result complements results of Vondrák (2008), who has shown that the continuous greedy
algorithm produces a $(1 - e^{-c})/c$ approximation when the objective function has curvature $c$.
He has also proved that achieving any better approximation ratio is impossible in the value 
oracle model. 

This paper supersedes the authors' FOCS 2012 paper. A reworking of the proof can be found
on the first author's homepage. 
1 Introduction



In this paper, we consider the problem of maximizing a monotone submodular function $f$, subject
to a single matroid constraint. Formally, let $U$ be a set of elements and let $f\colon 2^U \to \mathbb{R}_{\ge 0}$ be a
function assigning a value to each subset of $U$. We say that $f$ is submodular if

$$f(A) + f(B) \ge f(A \cup B) + f(A \cap B)$$

for all $A, B \subseteq U$. If additionally $f(A) \le f(B)$ whenever $A \subseteq B$, we say that $f$ is monotone
submodular. Submodular functions exhibit (and are, in fact, alternately characterized by) the
property of diminishing returns: if $f$ is submodular then $f(A \cup \{x\}) - f(A) \le f(B \cup \{x\}) - f(B)$
for all $B \subseteq A$ and $x \in U \setminus A$. Hence, they are useful for modeling various economic and game-theoretic scenarios, as
well as various combinatorial problems. In a general monotone submodular maximization problem,
we are given a value oracle for $f$ and a membership oracle for some distinguished collection $\mathcal{I} \subseteq 2^U$
of feasible sets, and our goal is to find a member of $\mathcal{I}$ that maximizes the value of $f$. We assume
further that $f$ is normalized so that $f(\emptyset) = 0$.
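To make the definitions concrete, the following minimal sketch checks submodularity and monotonicity by brute force on a small coverage function. The instance `COVERS` and all helper names are illustrative assumptions, not taken from the paper, and the exhaustive check is only feasible for tiny ground sets.

```python
from itertools import combinations

# Hypothetical toy instance: each ground element covers a set of items.
COVERS = {"a": {1, 2}, "b": {2, 3}, "c": {3}, "d": {4}}
U = frozenset(COVERS)


def f(S):
    """Coverage value: number of distinct items covered by the elements of S."""
    covered = set()
    for x in S:
        covered |= COVERS[x]
    return len(covered)


def subsets(ground):
    """All subsets of `ground`, as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_submodular(func, ground):
    """Check func(A) + func(B) >= func(A | B) + func(A & B) for all A, B."""
    P = subsets(ground)
    return all(func(A) + func(B) >= func(A | B) + func(A & B)
               for A in P for B in P)


def is_monotone(func, ground):
    """Check func(A) <= func(B) whenever A is a subset of B."""
    P = subsets(ground)
    return all(func(A) <= func(B) for A in P for B in P if A <= B)
```

On this instance both checks succeed, while a function such as $|S|^2$ fails the submodularity check.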

We consider the restricted setting in which the collection I is a matroid. Matroids are inti- 
mately connected to combinatorial optimization: the problem of optimizing a linear function over a 
hereditary set system (a set system closed under taking subsets) is solved optimally for all possible 
functions by the standard greedy algorithm if and only if the set system is a matroid [271 E] • 

In the case of a monotone submodular objective function, the standard greedy algorithm, which
takes at each step the element yielding the largest increase in $f$ while maintaining independence,
is (only) a 1/2-approximation [15]. Recently, Calinescu et al. [5, 28, 6] have developed a $(1 - 1/e)$-approximation
for this problem via the continuous greedy algorithm, which is essentially a steepest
descent algorithm running in continuous time (in practice, a suitably discretized version is used),
producing a fractional solution. This solution is rounded using pipage rounding [1] or swap rounding.

Feige shows that improving the bound $(1 - 1/e)$ is NP-hard. Nemhauser and Wolsey
show that any improvement over $(1 - 1/e)$ requires an exponential number of queries in the value
oracle setting.

Following Vondrák [29], we also consider the case when $f$ has restricted curvature. We say that
$f$ has restricted curvature $c$ with respect to $\mathcal{I}$ if for any two non-empty and disjoint independent
sets $A, B$,

$$f(A \cup B) \ge f(A) + (1 - c) f(B).$$

When $c = 1$, this is a restatement of monotonicity. Vondrák [29] has shown that the continuous
greedy algorithm produces a $(1 - e^{-c})/c$ approximation when $f$ has restricted curvature $c$.
Furthermore, he has shown that any improvement over $(1 - e^{-c})/c$ requires an exponential number of
queries in the value oracle setting.

Our definition of curvature differs from the usual (non-restricted) definition of curvature in that 
we only consider sets A and B that are independent, while the general definition requires that the 
above inequality hold for all (not necessarily independent) sets A and B. Therefore, every function 
with curvature at most c also has restricted curvature at most c and so our algorithm applies to a 
wider class of functions than those with non-restricted curvature at most c. 
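The difference between restricted and non-restricted curvature can be illustrated numerically. The sketch below computes, by brute force, the smallest $c$ for which $f(A \cup B) \ge f(A) + (1 - c) f(B)$ holds over a given family of sets. The coverage instance and the function names are hypothetical, and the "matroid" is simply taken to be a uniform matroid of a chosen rank.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical coverage instance (not from the paper).
COVERS = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4}}


def f(S):
    """Coverage value of the set S."""
    covered = set()
    for x in S:
        covered |= COVERS[x]
    return len(covered)


def nonempty_sets(ground, max_size):
    """Non-empty subsets of size at most max_size (independent sets of a uniform matroid)."""
    items = sorted(ground)
    return [frozenset(c) for r in range(1, max_size + 1)
            for c in combinations(items, r)]


def smallest_curvature(func, family):
    """Smallest c with func(A | B) >= func(A) + (1 - c) * func(B)
    for all disjoint A, B drawn from `family`."""
    c = Fraction(0)
    for A in family:
        for B in family:
            if A & B or func(B) == 0:
                continue
            gain = func(A | B) - func(A)
            c = max(c, 1 - Fraction(gain, func(B)))
    return c
```

With only singletons allowed (rank 1) this instance has restricted curvature 1/2, whereas ranging over all subsets gives curvature 1, since `b` adds nothing once `a` and `c` are both present.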
1.1 Our contribution



In this paper, we propose a conceptually simple randomized polynomial time local search algorithm 
for the problem of monotone submodular matroid maximization. Like the continuous greedy
algorithm, our algorithm delivers the optimal $(1 - 1/e)$-approximation. However, unlike the continuous
greedy algorithm, our algorithm is entirely combinatorial, in the sense that it deals only with inte- 
gral solutions to the problem and hence involves no rounding procedure. As such, we believe that 
the algorithm may serve as a gateway to further improved algorithms in contexts where pipage 
rounding and swap rounding break down, such as submodular maximization subject to multiple 
matroid constraints. 

Our main results are a combinatorial 1 — 1/e — e approximation algorithm for monotone submod- 
ular matroid maximization, running in randomized time 0(e~'^n^u) and a combinatorial 1 — 1/e 
approximation algorithm running in randomized time O(n^it^), where n is the rank of the given 
matroid and $u$ is the size of its ground set. We compare the runtime of our algorithms to the
continuous greedy algorithm (our estimate of the continuous greedy algorithm's runtime appears
in Appendix B). We show that, while our algorithms' runtimes are a factor of $O(n)$ greater than
the continuous greedy algorithm's initial continuous greedy phase, they have greatly improved
dependence on $u$, when compared to the continuous greedy algorithm's rounding phase (either pipage
rounding or swap rounding).^1 Thus, our algorithms attain a significant runtime improvement over
the continuous greedy algorithm when $n = o(u)$.

Our algorithm further generalizes to the case in which the submodular function has restricted
curvature $c$. In this case the approximation ratios obtained are $(1 - e^{-c})/c - \epsilon$ and $(1 - e^{-c})/c$,
respectively, again matching the performance of the continuous greedy algorithm [29]. Unlike the
continuous greedy algorithm, our algorithm requires knowledge of $c$. However, by enumerating over
values of $c$ we are able to obtain a combinatorial $(1 - e^{-c})/c$ algorithm even in the case that $f$'s
curvature is not given (assuming it is bounded away from zero). Vondrak's make use of restricted 
notions of curvature, and so apply to a class of functions slightly larger than those of curvature at 
most c. 

Our algorithmic approach is based on non-oblivious local search, a technique first proposed by 
Alimonti [2] and by Khanna, Motwani, Sudan and Vazirani [20]. In classical (or, oblivious) local
search, the algorithm starts at an arbitrary solution, and proceeds by iteratively making small
changes that improve the objective function, until no such improvement can be made. The locality
ratio of a local search algorithm is $\min f(S)/f(O)$, where $S$ is a solution that is locally-optimal
with respect to the small changes considered by the algorithm, $O$ is a global optimum, and $f$ is the
objective function. The locality ratio provides a natural, worst-case guarantee on the approximation 
performance of the local search algorithm. 

In many cases, oblivious local search may have a very poor locality ratio, implying that a locally- 
optimal solution may be of significantly lower quality than the global optimum. For example, for 
submodular matroid maximization, the locality ratio for an algorithm changing a single element at 
each step is 1/2 [15]. Non-oblivious local search attempts to avoid this problem by making use of
a secondary potential function to guide the search. By carefully choosing this auxiliary function, 
we ensure that poor local optima with respect to the original objective function are no longer local 
optima with respect to the new potential function. 
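A small example may help to see how a poor local optimum arises and how an auxiliary potential can escape it. In the hypothetical coverage instance below (one set must be chosen from each of two parts), single-element swaps guided by the objective $f$ get stuck at value 1 while the optimum is 2. The toy potential `h`, which counts an item covered twice as 1.5, is an illustrative assumption and is not the potential function defined in this paper.

```python
# Hypothetical instance: choose one element from each part of a partition matroid.
COVERS = {"a": {"x"}, "b": {"y"}, "c": set(), "d": {"x"}}
PARTS = [{"a", "b"}, {"c", "d"}]


def independent(S):
    """Independence oracle: at most one element from each part."""
    return all(len(set(S) & part) <= 1 for part in PARTS)


def f(S):
    """Objective: number of distinct items covered."""
    return len(set().union(*[COVERS[e] for e in S]))


def h(S):
    """Toy non-oblivious potential: an item covered twice counts 1.5 (assumed weight)."""
    counts = {}
    for e in S:
        for item in COVERS[e]:
            counts[item] = counts.get(item, 0) + 1
    weight = {1: 1.0, 2: 1.5}
    return sum(weight[k] for k in counts.values())


def local_search(start, potential):
    """Swap one element at a time while the potential strictly improves."""
    S = frozenset(start)
    improved = True
    while improved:
        improved = False
        for e in sorted(S):
            for x in sorted(set(COVERS) - S):
                T = (S - {e}) | {x}
                if independent(T) and potential(T) > potential(S):
                    S, improved = T, True
                    break
            if improved:
                break
    return S
```

Starting from `{"a", "c"}`, search guided by `f` makes no move, whereas search guided by `h` first moves to `{"a", "d"}` (covering `x` twice) and then to the optimal `{"b", "d"}`.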

In previous work [14], we designed an optimal non-oblivious local search algorithm for the
restricted case of maximum coverage subject to a matroid constraint. In this problem, we are
given a weighted universe of elements, a collection of sets, and a matroid defined on this collection.
The goal is to find a collection of sets that is independent in the matroid and covers elements of
maximum total weight. The non-oblivious potential function used in [14] gives extra weight to
solutions that cover elements multiple times. In the present work, we extend this approach to
general monotone submodular functions. This presents two challenges: defining a non-oblivious
potential function without reference to the coverage representation, and analyzing the resulting
algorithm.

^1 The basic reason behind this dependence on $u$ is that the fractional solution in general has full support. Our
intermediate solutions, being bona fide bases, always have support of size $n$.

In order to define the general potential function, we construct a variant of the potential function 
from [14] which doesn't refer to elements. Instead, the potential function aggregates the information
obtained by applying the objective function to all subsets of the input, weighted according to their 
size. Intuitively, the resulting potential function gives extra weight to solutions that contain a large 
number of good sub-solutions, or equivalently, remain good solutions on average even when elements 
are randomly removed. An appropriate setting of the weights defining our potential function yields 
a function which coincides with the previous definition for coverage functions, but still makes sense 
for arbitrary monotone submodular functions. 

The analysis of the algorithm in [14] is relatively straightforward. For each type of element in
the universe of the coverage problem, we must prove a certain inequality among the coefficients 
defining the potential function. In the general setting, however, we need to construct a proof using 
only the inequalities given by monotonicity and submodularity. The resulting proof is non-obvious 
and delicate. For the proof to work, a certain sequence defined by a recurrence relation needs to be 
non-decreasing. The locality gap can then be read off the sequence. We describe a way to construct 
the sequence using the recurrence relation in such a way that it is non-decreasing. In order to show 
that the resulting performance ratio is at least $1 - 1/e$, we use an explicit formula for the sequence
in terms of Padé approximants of $e^x$.

1.2 Related work 

Fisher, Nemhauser and Wolsey [25, 15] analyze greedy and local search algorithms for submodular
maximization subject to various constraints, including single and multiple matroid constraints, and
obtain some of the earliest results in the area, including a $1/(k + 1)$-approximation for monotone
submodular maximization subject to k matroid constraints. A recent survey by Goundan and 
Schulz [17] reviews many results pertaining to the greedy algorithm for submodular maximization. 

More recently, Lee, Sviridenko and Vondrák [23] consider the problem of both monotone and
non-monotone submodular maximization subject to multiple matroid constraints, attaining a
$1/(k + \epsilon)$-approximation for monotone submodular maximization subject to $k \ge 2$ constraints using local
search. Feldman et al. [13] show that a local search algorithm attains the same bound for the
related class of $k$-exchange systems, which includes the intersection of $k$ strongly base orderable
matroids, as well as the independent set problem in $(k+1)$-claw free graphs. Further work by Ward
[30] shows that a non-oblivious local search routine attains an improved $2/(k + 3) - \epsilon$ approximation
for this class of problems.



In the case of unconstrained non-monotone maximization, Feige, Mirrokni and Vondrák [10]
give a 2/5 approximation via a randomized local search algorithm, and give an upper bound of 1/2
in the value oracle model. Gharan and Vondrák [16] improved the algorithmic result to 0.41 by
enhancing the local search algorithm with ideas borrowed from simulated annealing. Feldman, Naor
and Schwartz [12] later improved this to 0.42 by using a variant of the continuous greedy algorithm.
Buchbinder, Feldman, Naor and Schwartz have recently obtained an optimal 1/2 approximation
algorithm [1].

In the setting of constrained non-monotone submodular maximization, Lee et al. [22] give a
$1/(k + 2 + 1/k + \epsilon)$ approximation subject to $k$ matroid constraints and a $1/(5 - \epsilon)$ approximation for
$k$ knapsack constraints. Further work by Lee, Sviridenko and Vondrák [23] improves the
approximation ratio in the case of $k$ matroid constraints to $1/(k + 1 + 1/(k - 1) + \epsilon)$. Feldman et al. [13]
attain this ratio for $k$-exchange systems. In the case of non-monotone submodular maximization
subject to a single matroid constraint, Feldman, Naor and Schwartz [11] obtain a $1/e$ approximation
by using a version of the continuous greedy algorithm. They additionally unify various
applications of the continuous greedy and obtain improved approximations for non-monotone submodular
maximization subject to a matroid constraint or $O(1)$ knapsack constraints.

1.3 Organization of the paper 

The rest of the paper is organized as follows. In Section 2 we give the definition of our algorithm,
and present a high-level overview of the proofs and concepts used in its analysis. Section 3 defines
the non-oblivious potential functions $g$ and $\tilde{g}$ used by our algorithm and provides some of their
properties. Section 4 determines the locality gap and runtime of the resulting algorithm, together
with some improvements on it. Appendix A provides some intuition for the definition of $g$ by showing
that $g$ agrees with the non-oblivious potential function defined in [14] when $c = 1$. Appendix B
gives a detailed analysis of the running time of the continuous greedy algorithm for the sake of
comparison.^2 The rest of the appendices contain proofs of various results claimed in the main body
of the paper. 

2 The algorithm 

Our non-oblivious local search algorithm is shown in Algorithm 1. The input to the algorithm is
a matroid $\mathcal{M}$, a monotone submodular function $f$, an upper bound on its restricted curvature $c$
(we assume $0 < c \le 1$), an error parameter $\epsilon_0$, and a sampling parameter $N$. The matroid $\mathcal{M}$ is
given as a universe $U$ and a collection of independent sets $\mathcal{I} \subseteq 2^U$, itself given as an independence
oracle (an oracle deciding whether $S \in \mathcal{I}$ for an arbitrary subset $S \subseteq U$). Throughout the paper,
we let $n$ denote the rank of $\mathcal{M}$ and $u = |U|$. The reader interested only in monotone submodular
maximization, without any restriction on the curvature, can substitute 1 for $c$ in all that follows.

In Theorem 4.10 we show how to choose $\epsilon_0$ and $N$ in order to obtain (with high probability)
an approximation ratio as close to $(1 - e^{-c})/c$ as desired. Theorem 4.12 shows how to use the
algorithm as a black-box in order to remove the small extra factor $\epsilon$ from the approximation ratio
to yield a clean $(1 - e^{-c})/c$-approximation algorithm, given $c$. Theorem 4.13 shows how to use
the algorithm as a black-box to yield a $((1 - e^{-c})/c - \epsilon)$-approximation algorithm even without
knowing $c$ in advance.

We define the function $g$ in Section 3. The function is defined by a sequence of coefficients
obtained from an auxiliary sequence $\gamma^{(m)}$, depending on the rank of the given matroid. Because our
non-oblivious potential function $g$ requires an exponential number of value queries to $f$ to compute
exactly, Algorithm 1 makes use of an approximation $\tilde{g}$ to $g$, obtained by evaluating $f$ on $N$
random samples. We define the random process and sampling used to compute $\tilde{g}$ in Lemma 3.3.

^2 Because the rounding part of the continuous greedy algorithm relies on several components, including submodular
minimization (pipage rounding) or an efficient implementation of Carathéodory's theorem in the matroid base
polytope (swap rounding), it is difficult to give a general runtime for it. We have tried to be as generous as possible
in our analysis.



Input: $\mathcal{M} = (U, \mathcal{I})$, $f$, $c$, $\epsilon_0$, $N$

Determine the rank of $\mathcal{M}$ and calculate coefficients for $g$;
Let $\tilde{g}$ be the approximation to $g$ obtained by using $N$ random samples;
Let $S_{\mathrm{init}}$ be the result of running the standard greedy algorithm on $(\mathcal{M}, \tilde{g})$;
$S \leftarrow S_{\mathrm{init}}$;
repeat
    foreach element $e \in S$ and $x \in U \setminus S$ do
        $S' \leftarrow S \setminus \{e\} \cup \{x\}$;
        if $S' \in \mathcal{I}$ and $\tilde{g}(S') > (1 + \epsilon_0)\,\tilde{g}(S)$ then
            $S \leftarrow S'$;
            break;
until no exchange is made;
return $S$;

Algorithm 1: The non-oblivious local search algorithm

The purpose of the greedy phase is to produce a set $S_{\mathrm{init}}$ with non-negligible $g(S_{\mathrm{init}})$. This
will allow us to bound the number of iterations in the main loop. Instead of running the greedy
phase with the function $g$, we could also run it with $f$, obtaining marginally inferior results. An
even simpler approach is to "guess" a set 5i such that g{{Si}) > max g{S)/n, at a multiplicative 
$O(n \log n)$ cost in the runtime.
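The control flow of Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the potential is passed in as an arbitrary callable `g` (standing in for the sampled potential), the matroid is given by a hypothetical `independent` oracle, and the demonstration instance simply reuses the objective $f$ as the potential on a uniform matroid of rank 2.

```python
def greedy(ground, independent, g):
    """Standard greedy: repeatedly add the feasible element with the largest gain in g."""
    S = frozenset()
    while True:
        best, best_gain = None, 0
        for x in sorted(ground - S):
            if independent(S | {x}):
                gain = g(S | {x}) - g(S)
                if best is None or gain > best_gain:
                    best, best_gain = x, gain
        if best is None:          # no element can be added: S is a base
            return S
        S = S | {best}


def non_oblivious_local_search(ground, independent, g, eps0):
    """Greedy phase followed by single-element exchanges that improve g
    by a factor of more than (1 + eps0)."""
    S = greedy(ground, independent, g)
    while True:
        exchanged = False
        for e in sorted(S):
            for x in sorted(ground - S):
                T = (S - {e}) | {x}
                if independent(T) and g(T) > (1 + eps0) * g(S):
                    S, exchanged = T, True
                    break
            if exchanged:
                break
        if not exchanged:
            return S


# Hypothetical demonstration instance (coverage objective, uniform matroid of rank 2).
COVERS = {"a": {1, 2, 3}, "b": {1, 2}, "c": {3, 4}, "d": {5}}
GROUND = frozenset(COVERS)


def f(S):
    """Coverage value of S."""
    covered = set()
    for x in S:
        covered |= COVERS[x]
    return len(covered)


def independent(S):
    """Independence oracle of the rank-2 uniform matroid."""
    return len(S) <= 2
```

Calling `non_oblivious_local_search(GROUND, independent, f, 0.0)` returns a base of value 4 on this instance; substituting a different potential for `f` changes only the `g` argument.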



3 The non-oblivious potential function 

We now define our non-oblivious potential function g, which will depend on the value of the pa- 
rameter c. As in the coverage case, our goal is to somehow give extra weight to solutions that will 
have more flexibility in future iterations of the local search procedure. One way to do this is to 
incorporate the value of all subsets of a solution S into the value of g{S). Then, we can give some 
extra value to solutions that contain a large number of good sub-solutions. 

With this approach in mind, we define $g(S)$ to be a combination of the values $f(T)$ for all
$T \subseteq S$. The extra weight that a subset $T$ contributes will depend on both its size and the size of
the solution $S$ on which we are evaluating $g$. Our function $g$ has the general form

$$g(S) = \sum_{k=1}^{|S|} \sum_{T \in \binom{S}{k}} \frac{\beta^{(|S|)}_k}{\binom{|S|}{k}} f(T) = \sum_{k=1}^{|S|} \beta^{(|S|)}_k \mathop{\mathbb{E}}_{T \in \binom{S}{k}} f(T). \tag{1}$$

Here, $\beta^{(|S|)}_k / \binom{|S|}{k} \ge 0$ is the weight given to the value $f(T)$ on any subset $T$ of size $k$. Alternatively,
we can think of $g$ in terms of the following random process. First, we choose a value $k$ between
$1$ and $|S|$ with probability proportional to $\beta^{(|S|)}_k$. Then, we choose $k$ items randomly from $S$ to
obtain a subset $T$. The value of $g$ is then (up to a multiplicative constant) precisely the expected
value of $f$ on the resulting random subset $T$.
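The two views of $g$, as a weighted sum over all subsets and as a scaled expectation over a two-step random choice, can be compared exactly on a small instance. In the sketch below the weights `beta` are arbitrary placeholder values chosen for illustration; they are not the coefficient sequences constructed in this section, and the coverage instance is hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Hypothetical coverage instance.
COVERS = {"a": {1, 2}, "b": {2, 3}, "c": {3}}


def cover(S):
    """Coverage value of S."""
    covered = set()
    for x in S:
        covered |= COVERS[x]
    return len(covered)


def potential(f, S, beta):
    """Sum over all non-empty T in S of beta[|T|] / C(|S|, |T|) * f(T)."""
    S = sorted(S)
    m = len(S)
    total = Fraction(0)
    for k in range(1, m + 1):
        weight = Fraction(beta[k]) / comb(m, k)
        for T in combinations(S, k):
            total += weight * f(frozenset(T))
    return total


def potential_via_expectation(f, S, beta):
    """Same quantity written as alpha * E[f(T)]: pick size k with probability
    beta[k] / alpha, then a uniformly random subset T of that size."""
    S = sorted(S)
    m = len(S)
    alpha = sum(Fraction(beta[k]) for k in range(1, m + 1))
    expectation = Fraction(0)
    for k in range(1, m + 1):
        p_k = Fraction(beta[k]) / alpha
        subsets = list(combinations(S, k))
        mean_k = sum(Fraction(f(frozenset(T))) for T in subsets) / len(subsets)
        expectation += p_k * mean_k
    return alpha * expectation
```

Both functions enumerate every subset of $S$, so they take exponential time; they are meant only to confirm that the two formulations agree.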
Note that the coefficients $\beta$ depend on the size of $S$; that is, we have a separate sequence
of coefficients $\beta^{(m)}_1, \ldots, \beta^{(m)}_m$ for each value of $m$ (where $m$ corresponds to the size of the set $S$
on which $g$ is being evaluated). The local search phase of Algorithm 1, as well as the proof of
Theorem 4.6, which bounds the locality gap of $g$, requires only that $g$ be defined for sets of size $n$.
The analysis of the initial greedy phase requires that $g$ be defined for sets of size up to $2n$, and be
monotone submodular. In order to show that $g$ is monotone submodular, and to relate $g$ to the
potential function obtained in [14], we require a sequence of coefficients for each $m$.

In order to complete our definition of the non-oblivious local search algorithm, we must "only"
specify appropriate values for these coefficients. We now turn to this task. In later proofs, it will
be more convenient to work with expressions of the form $\gamma^{(m)}_k = k\beta^{(m)}_k$. As we have noted, in the
execution and analysis of Algorithm 1 we need only consider sets of size at most $2n$ (the initial
greedy phase considers sets of size at most $|O \cup S|$). Although they are not needed by the algorithm,
our analysis will make use of additional coefficients $\gamma^{(m)}_0$ and $\gamma^{(m)}_{m+1}$ for each value of $m > 0$, and
additional sequences $\gamma^{(m)}$ for $m > 2n$.

We now give a formal construction for the necessary coefficient sequences. Let g = 2 [^J + 1. 

We set 7g^"'' = 7g^i = 1 and define the rest of the sequence 7(2"-) = -y^"'^ , • • • , 72n+i using the 
recurrence 

$$\ell\,\gamma^{(m)}_{\ell+1} = (2\ell - m + c - 1)\,\gamma^{(m)}_{\ell} + (m - \ell + 1)\,\gamma^{(m)}_{\ell-1}, \qquad 1 \le \ell \le m \tag{$\gamma$-REC}$$

with $m = 2n$. We normalize the resulting sequence by dividing it by $\gamma^{(2n)}_0$. Note that the normalized
sequence still satisfies recurrence ($\gamma$-REC), but additionally has $\gamma^{(2n)}_0 = 1$. Then, we generate the
sequences $\gamma^{(m)}$ for all $1 \le m < 2n$ and $m > 2n$ from the resulting sequence by repeated use of the
downward and upward recurrences

e 

(m— 1) (m) , C \ ^ (m) / „ \ 

Ti =7o +-2^7fc (7-DOWN) 

k=l 



Lemma 3.1. The sequences 7*-™^ = 7o , • • • ,7m+i; satisfy the following properties: 

(a) 7('") satisfies recurrence ( i7-REC| ) for all m > 1. 

(b) 7^™) is non- decreasing and non-negative for < m < 2n. 

(c) 7(™~^) and 7^"^) satisfy both ( |7-DOWNj ) and i \y-VP\) for all m>l. 

We defer the proof of the lemma to Appendix[Cj We comment, without proof, that 7^,™^ = exp :^^^ + 
0{c/m). 

Many of our proofs will require bounding the value of the largest term 7^^;^ in the resulting 
sequences. According to the last case of ( |7-upj ), the sequences for < m < 2re all have the same 
value for this term, which is thus some constant E independent of m. In order to do this, we derive 
an explicit formula for each sequence: 
Lemma 3.2. For all $m > 0$, the terms of $\gamma^{(m)}$ are given by the following formula:



fc=0 




(■ 



m — 1 — k 

e-k 



E - 



m — 1 — k 

e 



) 



(7-CL0SED) 



where E = Tm+i 



and < £ < m, — 1. 



Using the resulting sequences to set $\beta^{(m)}_k = \gamma^{(m)}_k / k$, we complete the definition of our
potential function. We now consider some of the properties of the resulting function $g$. As we show
in the appendix, if $f$ is monotone submodular, $g$ is monotone submodular when restricted to sets of
size at most $2n$. Moreover, $g$ has slightly lower curvature than $f$. Unfortunately, evaluating $g(S)$
requires evaluating $f$ on all subsets of $S$. Hence, we cannot compute $g$ directly without using an
exponential number of calls to the value oracle $f$. In our original description of $g$, we provided
some intuition in terms of a random process. We now formalize this intuition to show that $g$ can
be approximated by sampling.

Suppose that $|S| = m \le 2n$, and define $\alpha_m = \sum_{k=1}^{m} \beta^{(m)}_k$. We define the random set $X$ using
the following two step experiment. First, let $L$ be a random variable taking value $k$ with probability
$\beta^{(m)}_k / \alpha_m$; then choose $X$ as a uniformly random subset of $S$ of size $L$. Then, from the linearity of
expectation we have $\alpha_m \mathbb{E} f(X) = g(S)$. We now estimate the error incurred when $g$ is estimated
by taking $N$ samples. The following lemma is proved in Appendix E.

Lemma 3.3. Let $S$ be a set of size $m$. Let $N$ be a positive integer, and $X_1, \ldots, X_N$ be $N$ i.i.d.
random samples drawn from the distribution for $X$. Define $\tilde{g}(S) = \frac{1}{N} \sum_{i=1}^{N} \alpha_m f(X_i)$. For every $\epsilon > 0$,



In order to obtain a polynomial time algorithm, we use $\tilde{g}$ in Algorithm 1 rather than $g$, where
the number of samples is a parameter to the algorithm. 
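The sampling estimator can be sketched as follows. This is a minimal illustration assuming the two-step experiment described above (choose a size with probability proportional to its weight, then a uniformly random subset of that size); the weights `beta` are placeholder values rather than the paper's coefficients, and the function name `sample_potential` is hypothetical.

```python
import random


def sample_potential(f, S, beta, N, rng):
    """Estimate the potential of S as the average of alpha * f(X_i) over N i.i.d. samples.

    beta maps each size k in 1..|S| to a non-negative weight (assumed values);
    rng is a random.Random instance.
    """
    S = sorted(S)
    m = len(S)
    sizes = list(range(1, m + 1))
    weights = [beta[k] for k in sizes]
    alpha = sum(weights)
    total = 0.0
    for _ in range(N):
        k = rng.choices(sizes, weights=weights)[0]   # size k with probability beta[k] / alpha
        X = frozenset(rng.sample(S, k))               # uniformly random subset of size k
        total += alpha * f(X)
    return total / N
```

For example, with $f(T) = |T|$, $|S| = 2$ and weights `{1: 1.0, 2: 0.5}`, the exact value is $1 \cdot 1 + 0.5 \cdot 2 = 2$, and the estimate concentrates around it as $N$ grows.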

4 Analysis of the algorithm 

In this section, we derive a bound on the approximation ratio and running time of Algorithm 1.
Subsection 4.1 calculates (in Theorem 4.6) the locality gap of the local search phase in terms of the
$\gamma$ sequences used to define $g$. Subsection 4.2 deduces (in Theorem 4.10) the approximation ratio
of Algorithm 1 and estimates its running time. Subsection 4.3 shows (in Theorems 4.12 and 4.13)
how to obtain a clean $(1 - e^{-c})/c$ approximation (given $c$), and how to obtain a $(1 - e^{-c})/c - \epsilon$
approximation without knowing $c$ in advance.

4.1 Locality gap 

First, we establish some basic terminology used in this section. 

Definition 4.1 (Global and Approximate Local Optima). Let $(\mathcal{M} = (U, \mathcal{I}), f)$ be an instance of
monotone submodular maximization over a matroid, where $U$ is the ground set, $\mathcal{I}$ is the family of
independent sets of a matroid, and $f$ is a (normalized) monotone submodular function on $2^U$.
An independent set $O$ is a global optimum or optimal solution if for every $A \in \mathcal{I}$, $f(O) \ge f(A)$.
Since $f$ is monotone, there is always an optimal solution which is a base (maximal independent
set) of the matroid. We henceforth consider only global optima that are bases.

Let $\epsilon > 0$. A base $S$ is an $\epsilon$-approximate local optimum if for every $a \in S$ and for every $b \in U$
such that $S \setminus \{a\} \cup \{b\} \in \mathcal{I}$,

$$(1 + \epsilon)\, g(S) \ge g(S \setminus \{a\} \cup \{b\}).$$

Consider an instance $(\mathcal{M}, f)$ with optimal solution $O$ (without loss of generality, a base), and
let $S$ be the solution produced by the algorithm. Below, in Subsection 4.2, we show that $S$ is an
$\epsilon_1$-approximate local optimum of $g$, where $\epsilon_1 = O(\epsilon_0)$. Our bound, stated in Theorem 4.6, uses the
approximate local optimality of $g(S)$ to bound the quality of $f(S)$ in relation to $f(O)$.

Since both $O$ and $S$ are bases, a theorem of Brualdi [3] shows that there exists a bijection
$\pi\colon S \to O$ such that $S \setminus \{x\} \cup \{\pi(x)\}$ is a base of $\mathcal{M}$ for all $x \in S$. We index the elements of
$S = \{s_1, \ldots, s_n\}$ arbitrarily, and then for each element $s_i \in S$, we define $o_i = \pi(s_i)$. For a set of
indices $I \subseteq [n]$, we use the notation $S_I$ (respectively, $O_I$) to denote the set $\{s_i : i \in I\}$ (respectively,
$\{o_i : i \in I\}$). The notation $[n]$ itself is shorthand for $\{1, \ldots, n\}$. Note that this bijection, used here
to index the sets of $O$ and $S$, is essentially the only property of matroids that we require for the
remainder of our analysis.

It will be convenient to work with the following symmetric notation. Let $l, b, g$ be non-negative
integers satisfying $l + b \le n$ and $g + b \le n$. Then, we define $X_{l,b,g}$ to be the multiset of sets $S_L \cup O_G$
for all distinct pairs of index sets $L, G \subseteq [n]$ satisfying $|L| = l + b$, $|G| = g + b$, $|L \cap G| = b$. That is, $X_{l,b,g}$ is the collection
of all sets containing $l + b$ elements from $S$, and $g + b$ elements from $O$, where $b$ of the elements
have the same index in both $S$ and $O$. Note that if we have some element $x$ in $S \cap O$, then sets
containing $x$ will appear multiple times in $X_{l,b,g}$, once when $x$ is treated as an element of $S$ and
once when it is treated as an element of $O$. Hence, we have $|X_{l,b,g}| = \binom{n}{l+b}\binom{l+b}{b}\binom{n-l-b}{g}$. We define
$F_{l,b,g}$ to be the expected value of $f$ on a uniformly random set in $X_{l,b,g}$:

$$F_{l,b,g} = \frac{1}{|X_{l,b,g}|} \sum_{A \in X_{l,b,g}} f(A).$$

We adopt the convention that $F_{l,b,g} = 0$ if one of $l, b, g$ is negative.
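The symmetric notation is easy to misread, so a brute-force enumeration of the index pairs $(L, G)$ may be useful. The sketch below lists all pairs with $|L| = l + b$, $|G| = g + b$ and $|L \cap G| = b$ and compares their number with a product of three binomial coefficients; that closed form is our own count derived from the definition (choose $L$, then the $b$ shared indices, then the remaining $g$ indices of $G$ outside $L$), and the function names are hypothetical.

```python
from itertools import combinations
from math import comb


def index_pairs(n, l, b, g):
    """All (L, G) over indices 1..n with |L| = l + b, |G| = g + b and |L & G| = b."""
    idx = range(1, n + 1)
    pairs = []
    for L in combinations(idx, l + b):
        for G in combinations(idx, g + b):
            if len(set(L) & set(G)) == b:
                pairs.append((frozenset(L), frozenset(G)))
    return pairs


def count_formula(n, l, b, g):
    """Number of such pairs: C(n, l+b) * C(l+b, b) * C(n-l-b, g)."""
    return comb(n, l + b) * comb(l + b, b) * comb(n - l - b, g)
```

For instance, with $n = 5$, $l = 2$, $b = 1$, $g = 1$ both the enumeration and the formula give 60 pairs.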

The proof of Theorem 4.6 makes use of several inequalities following from local optimality, the
definition of g and the submodularity of /. Our first ancillary inequality simply re-expresses the 
(approximate) local optimality of S in our symmetric notation: 

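Lemma 4.2.

$$\epsilon_1 n \sum_{k=1}^{n} \beta^{(n)}_k F_{k,0,0} + \sum_{k=1}^{n} \gamma^{(n)}_k \left(F_{k,0,0} - F_{k-1,0,1}\right) \ge 0.$$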

Proof. Note that $S \setminus \{s_i\} \cup \{o_i\}$ is a base for all $1 \le i \le n$. Since $S$ is an $\epsilon_1$-approximate local
optimum, we must have $(1 + \epsilon_1) g(S) \ge g(S \setminus \{s_i\} \cup \{o_i\})$ for all such $i$. Summing over $1 \le i \le n$
gives:

$$\epsilon_1 n\, g(S) + n\, g(S) - \sum_{i=1}^{n} g(S \setminus \{s_i\} \cup \{o_i\}) \ge 0. \tag{2}$$

From the definition of $g$ and $F$, we have $g(S) = \sum_{k=1}^{n} \beta^{(n)}_k F_{k,0,0}$. We now focus on the final
summation in inequality (2). Consider an arbitrary set in $X_{k,0,0}$. This set has the form $S_I$ where
$|I| = k$, and appears as a subset of $S \setminus \{s_i\} \cup \{o_i\}$ for each value of $i \notin I$. Thus, it appears in the
sum $(n - k)$ times, each with weight $\beta^{(n)}_k / \binom{n}{k} = \beta^{(n)}_k / |X_{k,0,0}|$. Hence, the coefficient of $F_{k,0,0}$ in the sum is
$(n - k)\beta^{(n)}_k$. Now, consider an arbitrary set in $X_{k-1,0,1}$. This set has the form $S_I \cup O_{\{i\}}$ for some $I$
with $|I| = k - 1$ and $i \notin I$. Each such set appears as a subset of $S \setminus \{s_i\} \cup \{o_i\}$ for exactly one value
of $i$ and so appears in the sum once, with weight $\beta^{(n)}_k / \binom{n}{k} = k\beta^{(n)}_k / |X_{k-1,0,1}|$. Hence, the coefficient of
$F_{k-1,0,1}$ in the sum is $k\beta^{(n)}_k$. No other sets appear in the sum. It follows that the final summation
is equivalent to

$$\sum_{k=1}^{n} \left[ (n - k)\beta^{(n)}_k F_{k,0,0} + k\beta^{(n)}_k F_{k-1,0,1} \right]$$

and the claim follows from the definition $\gamma^{(n)}_k = k\beta^{(n)}_k$. □

Our remaining lemmas follow directly from the monotonicity and submodularity of $f$.

Lemma 4.3. For $\ell$ satisfying $0 \le \ell \le n$,

$$(n - \ell) F_{\ell,0,1} + \ell F_{\ell-1,0,1} \ge \ell F_{\ell-1,0,0} + (n - \ell - c) F_{\ell,0,0} + f(O).$$

In order to prove Lemma 4.3, we first prove two smaller inequalities following from submodularity
and monotonicity of $f$.

Lemma 4.4. For $\ell$ satisfying $0 \le \ell \le n$, $(n - \ell) F_{\ell,0,1} \ge (n - \ell - 1) F_{\ell,0,0} + F_{\ell,0,n-\ell}$.

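Proof. When $\ell = n$, the inequality reads $0 \ge -F_{n,0,0} + F_{n,0,0}$. When $\ell = n - 1$, it reads
$F_{n-1,0,1} \ge F_{n-1,0,1}$. So suppose $\ell \le n - 2$.

We show by induction that for $\ell + 1 \le k \le n$,

$$\sum_{i=\ell+1}^{k} f(S_{[\ell]} \cup O_{\{i\}}) \ge f(S_{[\ell]} \cup O_{\{\ell+1,\ldots,k\}}) + (k - \ell - 1) f(S_{[\ell]}).$$

The case $k = \ell + 1$ is trivial. For the induction step, we have:

$$\begin{aligned}
\sum_{i=\ell+1}^{k+1} f(S_{[\ell]} \cup O_{\{i\}}) &\ge f(S_{[\ell]} \cup O_{\{k+1\}}) + f(S_{[\ell]} \cup O_{\{\ell+1,\ldots,k\}}) + (k - \ell - 1) f(S_{[\ell]}) \\
&\ge f(S_{[\ell]} \cup O_{\{\ell+1,\ldots,k+1\}}) + f(S_{[\ell]}) + (k - \ell - 1) f(S_{[\ell]}),
\end{aligned}$$

where the first inequality follows from the induction hypothesis, and the second from submodularity
of $f$. Taking $k = n$ and then averaging over all permutations of indices yields the required inequality. □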

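Lemma 4.5. For $\ell$ satisfying $0 \le \ell \le n$, $F_{\ell,0,n-\ell} + \ell F_{\ell-1,0,1} \ge \ell F_{\ell-1,0,0} + (1 - c) F_{\ell,0,0} + f(O)$.

Proof. When $\ell = 0$, the inequality reads $F_{0,0,n} \ge (1 - c) F_{0,0,0} + f(O)$, which is true since $F_{0,0,n} = f(O)$
and $F_{0,0,0} = f(\emptyset) = 0$. So suppose $\ell \ge 1$.
Let $C = \{i \in [\ell] : s_i = o_i\}$. We show by induction that for $0 \le k \le \ell$,

$$\begin{aligned}
& f(S_{[\ell]} \cup O_{\{\ell+1,\ldots,n\}}) + \sum_{i=1}^{k} f(S_{[\ell] \setminus \{i\}} \cup O_{\{i\}}) \\
&\qquad \ge f(S_{[\ell]} \cup O_{[k] \cup \{\ell+1,\ldots,n\}}) + \sum_{i=1}^{k} f(S_{[\ell] \setminus \{i\}}) + (1 - c) \sum_{i \in C \cap [k]} f(\{s_i\}).
\end{aligned} \tag{3}$$

The case $k = 0$ is trivial. For the induction step, there are two cases. If $s_{k+1} \ne o_{k+1}$ then $k + 1 \notin C$. In this
case,

$$\begin{aligned}
& f(S_{[\ell]} \cup O_{\{\ell+1,\ldots,n\}}) + \sum_{i=1}^{k+1} f(S_{[\ell] \setminus \{i\}} \cup O_{\{i\}}) \\
&\qquad \ge f(S_{[\ell] \setminus \{k+1\}} \cup O_{\{k+1\}}) + f(S_{[\ell]} \cup O_{[k] \cup \{\ell+1,\ldots,n\}}) + \sum_{i=1}^{k} f(S_{[\ell] \setminus \{i\}}) + (1 - c) \sum_{i \in C \cap [k]} f(\{s_i\}) \\
&\qquad \ge f(S_{[\ell]} \cup O_{[k+1] \cup \{\ell+1,\ldots,n\}}) + f(S_{[\ell] \setminus \{k+1\}}) + \sum_{i=1}^{k} f(S_{[\ell] \setminus \{i\}}) + (1 - c) \sum_{i \in C \cap [k]} f(\{s_i\}),
\end{aligned}$$

where the first inequality follows from the induction hypothesis, and the second from submodularity
of $f$.

If $s_{k+1} = o_{k+1}$ then $k + 1 \in C$. In this case,

$$\begin{aligned}
& f(S_{[\ell]} \cup O_{\{\ell+1,\ldots,n\}}) + \sum_{i=1}^{k+1} f(S_{[\ell] \setminus \{i\}} \cup O_{\{i\}}) \\
&\qquad \ge f(S_{[\ell] \setminus \{k+1\}} \cup O_{\{k+1\}}) + f(S_{[\ell]} \cup O_{[k] \cup \{\ell+1,\ldots,n\}}) + \sum_{i=1}^{k} f(S_{[\ell] \setminus \{i\}}) + (1 - c) \sum_{i \in C \cap [k]} f(\{s_i\}) \\
&\qquad \ge f(S_{[\ell]} \cup O_{[k+1] \cup \{\ell+1,\ldots,n\}}) + f(S_{[\ell]}) + \sum_{i=1}^{k} f(S_{[\ell] \setminus \{i\}}) + (1 - c) \sum_{i \in C \cap [k]} f(\{s_i\}) \\
&\qquad \ge f(S_{[\ell]} \cup O_{[k+1] \cup \{\ell+1,\ldots,n\}}) + f(S_{[\ell] \setminus \{k+1\}}) + (1 - c) f(\{s_{k+1}\}) + \sum_{i=1}^{k} f(S_{[\ell] \setminus \{i\}}) + (1 - c) \sum_{i \in C \cap [k]} f(\{s_i\}),
\end{aligned}$$

where the first inequality follows from the induction hypothesis, the second from submodularity of
$f$ and the last from curvature of $f$.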
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Taking k = £ in we have: 

e 

f{S[e] U 0{^+i,. ..,„}) + ^ f{S[e]\{i} U 0{j}) 

i=l 

> /(%] U OMu{^+i,...,n}) + E fi^Wm) + (1 - c) E 

e 

> /(0[,]u{£+i,...,n}) + (1 - c)/(5[,]\c) + E /(^MUa) + (1 - c) E /(-t^*}) 

1=1 i&C 

I 

>/(0) + (l-c)/(5[,]) + E/(^M\w), 

where the second inequahty follows from curvature of / and the final line follows from submodularity 
of /. Averaging over all permutations of indices then gives the claimed inequality. □ 

Proof of Lemma 4.3. Adding the inequalities from Lemmas 4.4 and 4.5 gives

\[ (n-\ell) F_{\ell,0,1} + \ell F_{\ell-1,0,1} \ge \ell F_{\ell-1,0,0} + (n-\ell-c) F_{\ell,0,0} + f(O). \] □

Theorem 4.6. Suppose $O$ is a global optimum of $f$ and that $S$ is an $\epsilon_1$-approximate local optimum of $g$. Then,

\[ (1 + 8\epsilon_1 n \log n)\, \gamma_{n+1}^{(n)} f(S) \ge c^{-1} \bigl( \gamma_{n+1}^{(n)} - \gamma_0^{(n)} \bigr) f(O). \]

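Proof. Consider the inequality given in Lemma 4.2. From monotonicity of $f$, we have $F_{k,0,0} \le F_{n,0,0}$ for all $k \le n$. Using Lemma E.2 from Appendix E, which gives an upper bound on the sum $\sum_{k=1}^{n} \beta_k^{(n)}$, we obtain:

\[ \epsilon_1 n \sum_{k=1}^{n} \beta_k^{(n)} F_{k,0,0} \le \epsilon_1 n \sum_{k=1}^{n} \beta_k^{(n)} F_{n,0,0} \le 8 \epsilon_1 n \log n \, F_{n,0,0} \le 8 \epsilon_1 \gamma_{n+1}^{(n)} n \log n \, F_{n,0,0}, \]

where in the last step we have used the fact that $\gamma_{n+1}^{(n)} \ge \gamma_0^{(n)} = 1$ from Lemma 3.1. Furthermore, since $F_{0,0,0} = f(\emptyset) = 0$ and $F_{-1,0,1} = 0$, we have $\gamma_0^{(n)} (F_{0,0,0} - F_{-1,0,1}) = 0$, and so the inequality in Lemma 4.2 gives

\[ 8 \epsilon_1 \gamma_{n+1}^{(n)} n \log n \, F_{n,0,0} + \sum_{k=1}^{n} \gamma_k^{(n)} \bigl( F_{k,0,0} - F_{k-1,0,1} \bigr) \ge 0. \tag{4} \]

Since by Lemma 3.1 part (b), $\gamma^{(n)}$ is non-decreasing, we have $c^{-1}(\gamma_{\ell+1}^{(n)} - \gamma_\ell^{(n)}) \ge 0$ for all $0 \le \ell \le n$, and multiplying the inequality from Lemma 4.3 by $c^{-1}(\gamma_{\ell+1}^{(n)} - \gamma_\ell^{(n)})$ gives

\[ c^{-1}\bigl(\gamma_{\ell+1}^{(n)} - \gamma_\ell^{(n)}\bigr) \bigl[ (n-\ell) F_{\ell,0,1} + \ell F_{\ell-1,0,1} - \ell F_{\ell-1,0,0} - (n-\ell-c) F_{\ell,0,0} \bigr] \ge c^{-1}\bigl(\gamma_{\ell+1}^{(n)} - \gamma_\ell^{(n)}\bigr) f(O) \tag{5} \]

for $\ell \in \{0, \ldots, n\}$. We claim that the inequality that results from adding the $n+1$ inequalities given by (5) to inequality (4) is the desired inequality.

We first consider all terms of the form $F_{k,0,0}$. For $0 \le k \le n-1$, the coefficient of $F_{k,0,0}$ is

\begin{align*}
& \gamma_k^{(n)} - c^{-1}(n-k-c)\bigl(\gamma_{k+1}^{(n)} - \gamma_k^{(n)}\bigr) - c^{-1}(k+1)\bigl(\gamma_{k+2}^{(n)} - \gamma_{k+1}^{(n)}\bigr) \\
&\quad = c^{-1}(n-k)\gamma_k^{(n)} + c^{-1}(2k-n+c+1)\gamma_{k+1}^{(n)} - c^{-1}(k+1)\gamma_{k+2}^{(n)} = 0,
\end{align*}

where the final equality follows from ($\gamma$-REC). The coefficient of $F_{n,0,0} = f(S)$ is

\[ 8\epsilon_1 \gamma_{n+1}^{(n)} n \log n + \gamma_n^{(n)} + c^{-1} c \bigl(\gamma_{n+1}^{(n)} - \gamma_n^{(n)}\bigr) = (1 + 8\epsilon_1 n \log n)\gamma_{n+1}^{(n)}. \]

Now, we consider all terms of the form $F_{k-1,0,1}$. For $1 \le k \le n$, the coefficient of $F_{k-1,0,1}$ is

\begin{align*}
& -\gamma_k^{(n)} + c^{-1}(n-k+1)\bigl(\gamma_k^{(n)} - \gamma_{k-1}^{(n)}\bigr) + c^{-1}k\bigl(\gamma_{k+1}^{(n)} - \gamma_k^{(n)}\bigr) \\
&\quad = c^{-1}k\gamma_{k+1}^{(n)} - c^{-1}(2k-n+c-1)\gamma_k^{(n)} - c^{-1}(n-k+1)\gamma_{k-1}^{(n)} = 0,
\end{align*}

where the final equality follows from ($\gamma$-REC). Additionally, $F_{-1,0,1} = 0$ by definition. Finally, the coefficient of $f(O)$ is

\[ \sum_{k=0}^{n} c^{-1}\bigl(\gamma_{k+1}^{(n)} - \gamma_k^{(n)}\bigr) = c^{-1}\bigl(\gamma_{n+1}^{(n)} - \gamma_0^{(n)}\bigr). \] □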



4.2 Approximation ratio 

Now, we must translate the bounds that Theorem 4.6 gives for the locality gap into a performance guarantee for Algorithm 1. We already have $\gamma_0^{(n)} = 1$ by Lemma 3.1 (c). We now bound $\gamma_{n+1}^{(n)}$. In order to do this, we make use of a surprising connection between the explicit formula ($\gamma$-CLOSED) and the Padé approximants to the function $e^x$, defined in [26, §66]. For $\mu, \nu \ge 0$, the $(\mu,\nu)$-Padé approximant to $e^x$ is given by $R_{\mu,\nu} = P_{\mu,\nu}/Q_{\mu,\nu}$, whose numerator and denominator are defined by:

\[ P_{\mu,\nu}(x) = \sum_{k=0}^{\mu} \frac{x^k (\mu+\nu-k)!\, \mu!}{(\mu+\nu)!\, k!\, (\mu-k)!}, \qquad Q_{\mu,\nu}(x) = \sum_{k=0}^{\nu} \frac{(-x)^k (\mu+\nu-k)!\, \nu!}{(\mu+\nu)!\, k!\, (\nu-k)!}. \]

Both $P_{\mu,\nu}(x)$ and $Q_{\mu,\nu}(x)$ are positive (for the latter, note that each term in the alternating sum is of smaller magnitude than the preceding one). Additionally, we have the following formula from [26, §66]:

We can restate the explicit formula ($\gamma$-CLOSED) in terms of Padé numerators and denominators as

\[ \gamma_{\ell+1}^{(m)} = (-1)^\ell\, m!\, c^{-m} \binom{m-1}{\ell} \bigl[ Q_{m-1-\ell,\ell}(c)\, E - P_{m-1-\ell,\ell}(c) \bigr], \tag{7} \]

for all $m \ge 1$ and $0 \le \ell \le m-1$, where $E = \gamma_1^{(0)}$. This is proved formally in Lemma C.5, found in Appendix C. Using Equation (6), we derive the following bound on $\gamma_{n+1}^{(n)}$.

Lemma 4.7. $\gamma_{n+1}^{(n)} \ge e^c$.

Proof. By construction, $\gamma_q^{(2n)} = \gamma_{q+1}^{(2n)}$ and so $\gamma_{q+1}^{(2n+1)} = 0$. Thus formula (7) implies that $E = R_{2n-q,q}(c)$. Since $q$ is odd, expression (6) immediately implies that $E \ge e^c$. Finally, ($\gamma$-UP) gives $\gamma_{m+1}^{(m)} = \gamma_m^{(m-1)}$ for all $m \ge 1$, so $\gamma_{n+1}^{(n)} = \gamma_1^{(0)} = E$. □
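As an illustrative check of Lemma 4.7 (not part of the original analysis), the following sketch evaluates the Padé approximant $R_{2n-q,q}(c)$ with exact rational arithmetic and compares it with $e^c$. The function names are hypothetical, and the formula $q = 2\lfloor (n-1)/2 \rfloor + 1$ is an assumption read off from the bounds $n-1 \le q \le n$ with $q$ odd.

```python
from fractions import Fraction
from math import e, factorial


def pade_P(mu, nu, x):
    """Numerator P_{mu,nu}(x) of the (mu, nu)-Pade approximant to e^x."""
    return sum(
        Fraction(factorial(mu + nu - k) * factorial(mu),
                 factorial(mu + nu) * factorial(k) * factorial(mu - k)) * x ** k
        for k in range(mu + 1)
    )


def pade_Q(mu, nu, x):
    """Denominator Q_{mu,nu}(x) of the (mu, nu)-Pade approximant to e^x."""
    return sum(
        Fraction(factorial(mu + nu - k) * factorial(nu),
                 factorial(mu + nu) * factorial(k) * factorial(nu - k)) * (-x) ** k
        for k in range(nu + 1)
    )


def pade_R(mu, nu, x):
    """The approximant R_{mu,nu}(x) = P_{mu,nu}(x) / Q_{mu,nu}(x)."""
    return pade_P(mu, nu, x) / pade_Q(mu, nu, x)


def lemma_4_7_value(n, c):
    """E = R_{2n-q,q}(c), with q assumed to be the odd integer in {n-1, n}."""
    q = 2 * ((n - 1) // 2) + 1
    return pade_R(2 * n - q, q, c)


# For c = 1 the values approach e from above as n grows.
values = [lemma_4_7_value(n, Fraction(1)) for n in range(1, 5)]
```

Because the second index $q$ is odd, each value sits slightly above $e^c$, which is exactly what the lemma needs.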

As an aside, we note that ^ implies that if we put E = and use ( i7-CLOSEDj ) to define ^^"^^ , 
then 7(™') is non-decreasing for all m. 

Before stating our main theorems, we need the following elementary inequality. 

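Lemma 4.8. Let $\delta \le 1/2$. Suppose that $|\tilde{g}(A) - g(A)| \le \delta g(A)$, $|\tilde{g}(B) - g(B)| \le \delta g(B)$ and $(1+\delta)\tilde{g}(A) \ge \tilde{g}(B)$. Then $(1 + 7\delta) g(A) \ge g(B)$.

Proof. The premises imply

\[ (1+\delta)^2 g(A) \ge (1+\delta)\tilde{g}(A) \ge \tilde{g}(B) \ge (1-\delta) g(B). \]

Therefore

\[ \frac{(1+\delta)^2}{1-\delta}\, g(A) \ge g(B). \]

The expression on the left is bounded by

\[ \frac{(1+\delta)^2}{1-\delta} = 1 + \frac{3+\delta}{1-\delta}\,\delta \le 1 + 7\delta, \]

since the function $(3+\delta)/(1-\delta)$ is increasing and $\delta \le 1/2$. □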
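The elementary bound behind Lemma 4.8 can be checked numerically. The sketch below (an illustration only; the helper name `blowup` is ours) evaluates $(1+\delta)^2/(1-\delta)$ on a grid of rational $\delta \in (0, 1/2]$ and compares it with $1 + 7\delta$.

```python
from fractions import Fraction


def blowup(delta):
    """Factor (1 + delta)^2 / (1 - delta) relating g(A) and g(B) in the proof."""
    return (1 + delta) ** 2 / (1 - delta)


# Exact rational grid 0 < delta <= 1/2.
grid = [Fraction(k, 1000) for k in range(1, 501)]

# Largest value of blowup(delta) - (1 + 7 delta) over the grid.
worst_gap = max(blowup(d) - (1 + 7 * d) for d in grid)
```

The gap is never positive on the grid and vanishes at $\delta = 1/2$, where $(3+\delta)/(1-\delta) = 7$, so the constant 7 cannot be lowered for this range of $\delta$.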

We also need to know the approximation ratio of the greedy algorithm when the oracle for the
function is only approximate. Similar results appear in Goundan and Schulz [17] and Calinescu
et al. [6], though their measure of approximation for the oracle is different.

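Lemma 4.9. Let $S_{\mathrm{init}} = \{S_1, \ldots, S_n\}$ satisfy

\[ (1+\eta)\, g(S_{[k]}) \ge g(S_{[k-1]} \cup \{x\}) \]

for all $k \in [n]$ and all $x \in \mathcal{U}$ such that $S_{[k-1]} \cup \{x\} \in \mathcal{I}$. Let $g^*$ be the maximum value taken by $g$ on $\mathcal{I}$. Then

\[ g(S_{\mathrm{init}}) \ge \frac{g^*}{2 + n\eta}. \]

Proof. Throughout the proof, we will use the fact that $g$ is monotone and submodular whenever all sets involved are of cardinality at most $2n$.

Suppose $g^* = g(O)$, where $O = \{O_1, \ldots, O_n\}$ and $S_{[k-1]} \cup O_{\{k\}} \in \mathcal{I}$ for all $k \in [n]$. Then

\[ (1+\eta) \sum_{k=1}^{n} g(S_{[k]}) \ge \sum_{k=1}^{n} g(S_{[k-1]} \cup O_{\{k\}}). \]

Since $g(S_{[k]}) \le g(S_{\mathrm{init}})$, this implies

\begin{align*}
(1 + n\eta)\, g(S_{\mathrm{init}}) &= n\eta\, g(S_{\mathrm{init}}) + \sum_{k=1}^{n} \bigl[ g(S_{[k]}) - g(S_{[k-1]}) \bigr] \\
&\ge \sum_{k=1}^{n} \bigl[ g(S_{[k-1]} \cup O_{\{k\}}) - g(S_{[k-1]}) \bigr] \\
&\ge \sum_{k=1}^{n} \bigl[ g(S_{\mathrm{init}} \cup O_{\{k\}}) - g(S_{\mathrm{init}}) \bigr] \\
&\ge g(S_{\mathrm{init}} \cup O) - g(S_{\mathrm{init}}) \\
&\ge g(O) - g(S_{\mathrm{init}}).
\end{align*}

Rearranging,

\[ (2 + n\eta)\, g(S_{\mathrm{init}}) \ge g(O). \] □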
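To make the statement of Lemma 4.9 concrete, here is a minimal sketch of the special case $\eta = 0$ (an exact oracle) on a uniform matroid, where every set of at most `rank` elements is independent. A small coverage function stands in for $g$; the instance and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def coverage(sets, chosen):
    """Number of ground elements covered by the chosen sets (monotone submodular)."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)


def greedy(sets, rank):
    """Standard greedy: repeatedly add the index with the largest resulting value."""
    chosen = []
    for _ in range(rank):
        rest = [i for i in range(len(sets)) if i not in chosen]
        best = max(rest, key=lambda i: coverage(sets, chosen + [i]))
        chosen.append(best)
    return chosen


def brute_force_optimum(sets, rank):
    """Maximum value over all independent sets of full rank (tiny instances only)."""
    return max(coverage(sets, combo)
               for combo in combinations(range(len(sets)), rank))


example_sets = [{1, 2, 3, 4}, {1, 2, 5}, {3, 4, 6}, {5, 6}]
greedy_value = coverage(example_sets, greedy(example_sets, 2))
optimum_value = brute_force_optimum(example_sets, 2)
```

With $\eta = 0$ the lemma guarantees `greedy_value >= optimum_value / 2`; an approximate oracle only weakens the denominator from $2$ to $2 + n\eta$.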
Now, we are ready to state our main theorems regarding the performance of Algorithm 1.

Theorem 4.10. Given $0 < \epsilon < 1$ and $c \in (0, 1]$, set the parameters of Algorithm 1 as follows:

\[ \epsilon_0 = \frac{\epsilon}{56 n \log n}, \qquad N = 64 \epsilon^{-2} n^2 \log^4 n \, \log(60 \epsilon^{-1} n^2 u \log n). \]

With probability $1 - o(1)$, Algorithm 1 is a $\frac{1-e^{-c}}{c} - \epsilon$ approximation algorithm, running in time $\tilde{O}(\epsilon^{-3} n^4 u)$.

Proof. We analyze the algorithm under the assumption that whenever we evaluate $\tilde{g}(X)$, the value we obtain satisfies

\[ (1-\epsilon_0)\, g(X) \le \tilde{g}(X) \le (1+\epsilon_0)\, g(X). \]

Later we will show that this happens with probability $1 - o(1)$.

Let $O$ be the optimal solution for the instance we are considering and let $S$ be the solution produced by the algorithm. Lemma 4.8 shows that $S$ is a $7\epsilon_0$-approximate local optimum of $g$. Theorem 4.6 implies that

\[ (1 + 56 \epsilon_0 n \log n)\, f(S) \ge \frac{1 - 1/\gamma_{n+1}^{(n)}}{c}\, f(O). \]

Lemma 4.7 completes the proof of the approximation ratio of the algorithm.

We now bound the number of improvements our algorithm can make. Let $g^*$ be the maximum value taken by $g$ on $\mathcal{I}$. Applying Lemma 4.9 with $1 + \eta = (1+\epsilon_0)/(1-\epsilon_0) \le 1 + 3\epsilon_0$ (using $\epsilon_0 \le 1/56$), we deduce (this time using $\epsilon_0 \le 1/(56n)$)

\[ \tilde{g}(S_{\mathrm{init}}) \ge (1-\epsilon_0)\, g(S_{\mathrm{init}}) \ge (1-\epsilon_0) \frac{g^*}{2 + 3n\epsilon_0} \ge \frac{g^*}{2 + 8n\epsilon_0}. \]

Every time the algorithm applies an improvement, it must improve $\tilde{g}(S)$ by at least a factor of $(1+\epsilon_0)$. Furthermore, $\tilde{g}(S) \le (1+\epsilon_0) g^*$ for all $S$ we query. Thus, the number of improvements Algorithm 1 can make is at most:

\[ \log_{1+\epsilon_0} \frac{(1+\epsilon_0)\, g^*}{\tilde{g}(S_{\mathrm{init}})} \le \log_{1+\epsilon_0} (1+\epsilon_0)(2 + 8n\epsilon_0) \le \log_{1+\epsilon_0} (2 + 11 n \epsilon_0) \le \epsilon_0^{-1} = 56 \epsilon^{-1} n \log n. \]

The second inequality follows from the bound $\epsilon_0 \le 1/(56n)$.

Finally, we derive a bound on the number of samples needed to ensure that with high probability
$|\tilde{g}(X) - g(X)| \le \epsilon_0 g(X)$ for all sets considered by the algorithm. The initial greedy step requires at
most $nu$ total evaluations of $\tilde{g}$, and each improvement step requires at most $nu$ evaluations. Thus,
the algorithm requires at most $G = 60 \epsilon^{-1} n^2 u \log n$ total evaluations of $\tilde{g}$. Define

= 64eo ^ log^ n log G = 64e-'^n'^ log^ n log G. 

Lemma 13.31 shows that the probability that for a given set X, \g{X) — g{X)\ > eog{X), is at most 
Hence this never happens for any set considered by the algorithm with probability at least 
l-2/G = l-o(l). 

The final algorithm requires a total of 0(eQ ^n^n) calls to the value oracle for / and 0{eQ^'n?u log n) 
calls to the independence oracle for A4. Its runtime is proportional to the total number of oracle 
calls to /. □ 



4.3 Clean approximation

Algorithm 1 has two shortcomings: it requires prior knowledge of $c$, and it only approximates the
optimal approximation ratio $(1 - e^{-c})/c$. In this subsection we show how to overcome each of
these shortcomings individually. Combining the two techniques, we get a clean $(1 - e^{-c})/c$
approximation (for technical reasons, we need to assume a lower bound on $c$).

We can remove the $\epsilon$ from our approximation ratio by using a partial enumeration technique
described by Khuller et al. [21] and employed by Calinescu et al. [5]. Effectively, we try to "guess"
a single set in the optimal solution, and then run Algorithm 1 on an instance in which all solutions
contain this set. We then iterate over all possible guesses.

Formally, for a matroid $\mathcal{M} = (\mathcal{U}, \mathcal{I})$ and an element $x \in \mathcal{U}$, the contracted matroid $\mathcal{M}/x$ is a
matroid on $\mathcal{U} \setminus \{x\}$ in which a set $A$ is independent if and only if $A \cup \{x\} \in \mathcal{I}$. Similarly, for $x \in \mathcal{U}$
we define the contracted function $f_x(A)$ on $\mathcal{U} \setminus \{x\}$ by $f_x(A) = f(A \cup \{x\}) - f(\{x\})$ and note that
if $f$ is monotone submodular, then so is $f_x$.
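The two contractions just defined are easy to express as wrappers around a value oracle and an independence oracle. The following is a minimal sketch under the assumption that both oracles are plain Python callables on frozensets; the names `contract_function` and `contract_independence` and the tiny instance are hypothetical.

```python
def contract_function(f, x):
    """Return f_x with f_x(A) = f(A ∪ {x}) - f({x})."""
    def f_x(A):
        return f(frozenset(A) | {x}) - f(frozenset({x}))
    return f_x


def contract_independence(is_independent, x):
    """Return the independence oracle of M/x: A is independent iff A ∪ {x} is."""
    def is_independent_x(A):
        A = frozenset(A)
        return x not in A and is_independent(A | {x})
    return is_independent_x


# Tiny illustrative instance: a coverage function and a rank-2 uniform matroid.
family = {"a": {1, 2}, "b": {2, 3}, "c": {4}}


def cover(A):
    """Number of ground elements covered by the sets named in A."""
    covered = set()
    for name in A:
        covered |= family[name]
    return len(covered)


def rank2(A):
    """Independence oracle of the uniform matroid of rank 2."""
    return len(A) <= 2


cover_a = contract_function(cover, "a")
rank2_a = contract_independence(rank2, "a")
```

Here `cover_a` measures only what is gained on top of the guessed element `"a"`, and `rank2_a` behaves like a rank-1 uniform matroid on the remaining elements.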

We can now formulate the new algorithm. Algorithm 2 simply runs Algorithm 1 with suitable
parameters on the instance $\mathcal{M}/x, f_x$ for each $x \in \mathcal{U}$, and returns the best resulting solution. The
algorithm uses the function

\[ \rho(c) = \frac{1 - e^{-c}}{c}, \]

which gives the approximation ratio for given curvature.



Input: $\mathcal{M} = (\mathcal{U}, \mathcal{I})$, $f$, $c$
Set $\epsilon = \frac{1 - \rho(c)}{n}$, $\epsilon_0 = \frac{\epsilon}{56 n \log n}$, $N = 64 \epsilon^{-2} n^2 \log^4 n \, \log(60 \epsilon^{-1} n^2 u \log n)$.
for $x \in \mathcal{U}$ do
    Let $S_x$ be the result of running Algorithm 1 on $(\mathcal{M}/x, f_x, c, \epsilon_0, N)$;
Let $y = \arg\max_{x \in \mathcal{U}} f(S_x \cup \{x\})$;
return $S_y \cup \{y\}$

Algorithm 2: Clean $(1 - e^{-c})/c$ approximation algorithm

Lemma 4.11. If Algorithm 1 with parameters as in Algorithm 2 has an approximation ratio of
$\theta$ on matroids of rank $n-1$, then Algorithm 2 has an approximation ratio of $1/n + (1 - 1/n)\theta$ on
matroids of rank $n$.

Proof. Consider an instance $(\mathcal{M}, f)$, where $n$ is the rank of $\mathcal{M}$. Let $O$ be some optimal solution
for this instance, and $y = \arg\max_{x \in O} f(\{x\})$. Submodularity implies that $f(O) \le \sum_{x \in O} f(\{x\}) \le n f(\{y\})$. Furthermore, since Algorithm 1 is a $\theta$-approximation algorithm, we have

\[ f_y(S_y) \ge \theta f_y(O \setminus \{y\}) = \theta \bigl( f(O) - f(\{y\}) \bigr). \]

Let $S$ be the solution produced by Algorithm 2 on the instance $(\mathcal{M}, f)$. Then,

\begin{align*}
f(S) &\ge f(\{y\}) + f_y(S_y) \ge f(\{y\}) + \theta f_y(O \setminus \{y\}) = f(\{y\}) + \theta \bigl( f(O) - f(\{y\}) \bigr) \\
&= (1 - \theta) f(\{y\}) + \theta f(O) \ge \Bigl( \frac{1-\theta}{n} + \theta \Bigr) f(O) = \Bigl( \frac{1}{n} + \Bigl(1 - \frac{1}{n}\Bigr) \theta \Bigr) f(O).
\end{align*} □

Theorem 4.12. Let c G (0, 1], and suppose f has restricted curvature c. With probability 1 — o(l), 
Algorithmic is a (1 — e~^) / c- approximation algorithm running in time OdnJu^)- 

Proof. Theorem 4.10 implies that each instance of Algorithm 1 run within Algorithm 2 has an
approximation ratio of $(1 - e^{-c})/c - \epsilon$, where $\epsilon = (1 - \rho(c))/n$. From Lemma 4.11, Algorithm 2 is
then a

\[ \frac{1}{n} + \Bigl(1 - \frac{1}{n}\Bigr) \Bigl( \rho(c) - \frac{1 - \rho(c)}{n} \Bigr) = \rho(c) + \frac{1 - \rho(c)}{n^2} \ge \rho(c) \]

approximation. Algorithm [2] calls Algorithm [1] a total of n times with = 0(n/(l — p{c))). Its 
runtime is hence 0((1 — p{c))~'^7i^u). □ 
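The arithmetic that turns the guarantee of Lemma 4.11 into a clean ratio can be verified with exact rationals. In the sketch below, `rho` stands for any value of $\rho(c)$ in $[0, 1]$, and `combined_ratio` is a hypothetical helper name; the choice $\epsilon = (1 - \rho(c))/n$ is the one used in the proof.

```python
from fractions import Fraction


def combined_ratio(rho, n):
    """1/n + (1 - 1/n) * theta, where theta = rho - eps and eps = (1 - rho)/n."""
    eps = (1 - rho) / n
    theta = rho - eps
    return Fraction(1, n) + (1 - Fraction(1, n)) * theta


# Sample rational values of rho in [0, 1] and several ranks n.
samples = [(Fraction(p, 10), n) for p in range(0, 11) for n in range(1, 8)]
gaps = [combined_ratio(rho, n) - rho for rho, n in samples]
```

Each gap equals $(1 - \rho)/n^2$, so the $1/n$ gained by guessing one element of the optimum exactly absorbs the loss $\epsilon$ incurred by Algorithm 1.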

Algorithm 1 requires knowledge of $c$. We can eliminate this need by enumerating over possible
values of $c$ with enough granularity. Since the function $\rho(c)$ is continuous, an error in estimating $c$
will only result in a small decrease in the resulting approximation ratio.

Input: $\mathcal{M} = (\mathcal{U}, \mathcal{I})$, $f$, $\epsilon$
Set $\epsilon_2 = \frac{\epsilon}{2}$, $\epsilon_0 = \frac{\epsilon_2}{56 n \log n}$, $N = 64 \epsilon_2^{-2} n^2 \log^4 n \, \log(60 \epsilon_2^{-1} n^2 u \log n)$
Let $C = \{k\epsilon : 1 \le k \le \lfloor 1/\epsilon \rfloor\} \cup \{1\}$;
for $c \in C$ do
    Let $S_c$ be the result of running Algorithm 1 on $(\mathcal{M}, f, c, \epsilon_0, N)$;
return $\arg\max_{S_c : c \in C} f(S_c)$

Algorithm 3: Algorithm not requiring prior knowledge of the curvature



Theorem 4.13. Suppose $f$ has curvature $c$. With probability $1 - o(1)$, Algorithm 3 is a $(1 - e^{-c})/c - \epsilon$ approximation algorithm running in time $\tilde{O}(\epsilon^{-4} n^4 u)$.

Proof. Let $c_0 = \min\{x \in C : x \ge c\}$. Clearly $c \le c_0 \le c + \epsilon$. Since $f$ has curvature at most $c_0$,
Theorem 4.10 shows that $f(S_{c_0})/f(O) \ge \rho(c_0) - \epsilon/2$, where $O$ is a global optimum. Elementary
calculus shows that for $x \in [0, 1]$, $-1/2 \le \rho'(x) \le -1/4$. Therefore the approximation ratio of the
algorithm is at least $\rho(c_0) - \epsilon/2 \ge \rho(c + \epsilon) - \epsilon/2 \ge \rho(c) - \epsilon$. Finally, we run Algorithm 1 a total of
$\lceil 1/\epsilon \rceil$ times, leading to a running time of $\tilde{O}(\epsilon^{-4} n^4 u)$. □

The methods of Algorithms 2 and 3 can be put together to obtain a clean $(1 - e^{-c})/c$-approximation
algorithm that doesn't require knowledge of $c$. The reader who is interested in
carrying out the construction will discover that for technical reasons, this approach only works
given some lower bound on $c$.
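The enumeration over curvature guesses used by Algorithm 3 can be sketched as follows. This is only an illustration of the grid $C$ and of the choice of $c_0$ made in the proof of Theorem 4.13; the function names are hypothetical, and the approximation-ratio function is taken to be $(1 - e^{-c})/c$.

```python
from fractions import Fraction
from math import exp, floor


def curvature_grid(eps):
    """The set C = {k * eps : 1 <= k <= floor(1/eps)} ∪ {1}, sorted increasingly."""
    grid = {k * eps for k in range(1, floor(1 / eps) + 1)} | {Fraction(1)}
    return sorted(grid)


def smallest_guess_at_least(grid, c):
    """c_0 = min{x in C : x >= c}, the guess analysed in the proof."""
    return min(x for x in grid if x >= c)


def rho(c):
    """Approximation ratio (1 - e^{-c}) / c for curvature c > 0."""
    return (1 - exp(-c)) / c


eps = Fraction(3, 10)
grid = curvature_grid(eps)
c_true = Fraction(1, 2)
c_guess = smallest_guess_at_least(grid, c_true)
```

Since the guess overshoots the true curvature by at most $\epsilon$ and $\rho$ decreases with slope at most $1/2$ in absolute value, the ratio drops by at most $\epsilon/2$ relative to knowing $c$ exactly.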

5 Future work 

An immediate open question is whether our algorithm can be made deterministic, even if only for
particular classes of functions. If $f$ is a coverage function, we have already shown [13] that $g$
can be computed explicitly. However, this result required access to the representation of $f$ and so
is not possible in the general value oracle model. Even reducing the amount of sampling needed to
compute $g$ would be useful, as it would improve the runtime of the algorithm.

A more general question is whether this approach can be extended to other submodular maxi- 
mization problems, including non-monotone maximization or monotone maximization over multiple 
matroid constraints. A major difficulty in extending the continuous greedy algorithm to this latter 
case is that the pipage rounding phase does not generalize to multiple matroids. As our general 
technique does not require any rounding, it is a natural candidate for improvement in this area. 

Finally, we ask whether it is possible to match the improved performance of the continuous 
greedy algorithm for other problems or restricted settings by combinatorial algorithms similar to 
our own. Our algorithm as it stands already matches the performance of the continuous greedy 
algorithm not only in the classical case but also when the objective function has restricted curvature. 
It is unclear, however, whether our algorithm may match or improve results for other applications 
of the continuous greedy algorithm, such as those presented in |llj . 
Appendices



A Coverage functions 

In our previous paper [13], we discuss the case where $f$ is a coverage function. In this case, $\mathcal{U}$
is a collection of subsets of some other set $V$. The set $V$ has an associated non-negative weight
function $w$, and the function $f$ is the total weight of elements covered by the sets in its input. The
non-oblivious potential function used in [13] is

x&V 

where $h_x(S) = |\{A \in S : x \in A\}|$ is the number of sets containing $x$, and the sequence $\alpha_H$ is given by

\[ \alpha_0 = 0, \qquad \alpha_1 = E - 1, \qquad \alpha_{H+1} = (H+1)\alpha_H - H\alpha_{H-1} - 1. \tag{8} \]

In [13] we give a concrete value for $E$.

Even though the function G is presented quite differently from the function g defined in the 
present paper, the two functions coincide when c = 1. 

Lemma A.l. Suppose that f is a coverage function. For any set S, 

g{S) = {E-l)G{S). 

Proof. We will show that 



9{S) = J] oih^{S)w{x). 



Prom the definition of g, it is immediate that there exist constants C//"^ such that 



A priori, the coefficients c![^^ depend on m. However, our definition of g ensures that this is not the 
case. Consider the following thought experiment. Add to U an element consisting of the empty 
set. For every A C 5 we have /(Au{0}) = f{A), and so Lemma |E2 implies that g{S\J{%]) = g{S). 
On the other hand, 

5(5U{0}) = ^CSV^)- 

X&V 

Since S is arbitrary, we must have = C,^^\ Denoting the common value of all these sequences 

by we have 

9{S) = ^ Ch^i^s)w{x). 

x&V 

Our task is now reduced to showing C, = a. 

We can get a nice formula for Qh by considering the set Sh = {^i : 1 < « < H}, where Ai = {x} 
and X is an element of unit weight. Clearly giSn) = Ch- On the other hand, 

H H 
Therefore

H 



c. = E/5r- (9) 



k=l 



It remains to deduce recurrence ([8]) for C,. The first step is to get another expression for C,h- 
Consider the same set 5^ used previously. Taking Sh U {0} this time, we get 



H+l rr H+l 



Therefore 



k=2 



Equations (j9|). (fT0|) together imply that for every H, 

CH+i = C/f + -|— Y- (11) 

Recurrence (|lip enables us to prove recurrence ([8]) for C: 

Ch+1 = Ch + \— = Qh + 1^^^ -1 = Ch + H{Ch - Ch~i) -1 = {H + 1)Ch - HCh-i - 1. 

The first inequality follows from (|lip . the second from applying ( |7-up| ) to ^i^^^^ and using the 
fact that 7q =1. The third inequality follows from applying (|lip to 7{ , and the final 
inequality from simple algebra. The base cases for the recurrence follow directly from ([9]): <^o = 
and Ci = =E-l. □ 



B Running time of the continuous greedy algorithm

In this section, we analyze the running time of the continuous greedy algorithm [6], as well as a
variant which replaces pipage rounding with swap rounding [7]. The algorithm is composed of two
parts: the continuous greedy phase and the rounding phase. The second phase is not needed for
partition matroids.

We will use $T_f$ to denote the time required to evaluate an oracle call to the submodular function
$f$, and $T_{\mathcal{M}}$ to denote the time required to evaluate an oracle call to the matroid independence oracle.
Also, $n$ will be the rank of the matroid, $u$ will be the size of the universe, and $\epsilon > 0$ will be a
parameter such that the resulting approximation ratio is $1 - 1/e - \epsilon$.

The continuous greedy algorithm uses a continuous relaxation $F$ of the submodular function $f$.
The relaxation $F \colon [0, 1]^{\mathcal{U}} \to \mathbb{R}$ is defined as follows:

\[ F(y) = \mathbb{E}[f(\hat{y})], \]

where $\hat{y}$ is a random subset of $\mathcal{U}$ which contains each $j \in \mathcal{U}$ with probability $y_j$.

Continuous greedy phase. The goal of this part is to compute a vector $y \in [0, 1]^{\mathcal{U}}$ such that
$F(y) \ge (1 - 1/e - \epsilon)\mathrm{OPT}$. Let $\delta = \epsilon/3n$. We perform $1/\delta$ iterations consisting of the following two
steps:

1. For each $j \in \mathcal{U}$, estimate $\omega_j = \mathbb{E}[f(\hat{y} \cup \{j\}) - f(\hat{y})]$ by averaging over $O(\delta^{-2} \log u)$ samples.

2. Find a maximum-weight independent set in $\mathcal{M}$ with weights $\omega_j$, and modify $y$ accordingly.

The first step takes time $O(\delta^{-2} u \log u \, T_f)$. The second step takes time $O(u T_{\mathcal{M}})$. The total running
time of this phase is

\[ O(\epsilon^{-3} n^3 u \log u \, T_f + \epsilon^{-1} n u \, T_{\mathcal{M}}). \]
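For intuition about the relaxation $F$ used in this phase, the sketch below computes it exactly on a tiny universe by summing over all subsets, instead of sampling as the algorithm does. This is an illustration only: the function name and the dictionary representation of $y$ are assumptions, and the exhaustive sum is exponential in $u$.

```python
from itertools import combinations


def multilinear_extension(f, universe, y):
    """Exact F(y) = E[f(R)], where R contains each j independently with prob. y[j]."""
    universe = list(universe)
    total = 0.0
    for size in range(len(universe) + 1):
        for subset in combinations(universe, size):
            chosen = set(subset)
            prob = 1.0
            for j in universe:
                # Probability that exactly the elements of `chosen` are sampled.
                prob *= y[j] if j in chosen else 1.0 - y[j]
            total += prob * f(chosen)
    return total


universe = ["a", "b", "c"]
y = {"a": 0.5, "b": 0.25, "c": 1.0}

# For a modular function such as |S|, F(y) is just the sum of the y_j.
value_modular = multilinear_extension(len, universe, y)

# For the indicator of a non-empty set, F(y) = 1 - prod_j (1 - y_j).
value_nonempty = multilinear_extension(lambda s: 1 if s else 0, universe, y)
```

At integral points $y \in \{0,1\}^{\mathcal{U}}$ the extension coincides with $f$, which is why a rounding step that does not decrease $F$ yields a feasible set of the same quality.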

Rounding phase. This stage rounds $y$ into an integral solution $z$ satisfying $f(z) \ge F(y)$. The
original paper [6] implements this stage using pipage rounding. Chekuri et al. [7] suggested a
simpler and faster implementation, swap rounding. While their algorithm is randomized and only
produces a good solution in expectation and with high probability, it is easy to turn it into a
deterministic algorithm which always succeeds (this is already mentioned in their paper).

Pipage rounding Pipage rounding consists of up to v?' iterations of the basic subroutine 
Hit Constraint. Each call to HitConstraint requires solving the following problem: 

5 = min(r^i(^) - y{A)), A = {A Q U : i e A, j ^ A}. 

Here i,j are parameters, rj^i^) is the maximal rank of a subset of A, and y{A) = ^i^j^yi- The 
time it takes to evaluate r^(j4) — y{A) is = 0{uT_m). 

The paper suggests several methods for finding 6, which amounts to submodular minimization. 
Among the combinatorial methods, the fastest seems to be the one by Fleischer, Fujishige and 
Iwata |19] . which runs in time 0(n^ log nT^). Another method, by Grotschel, Lovasz and Schri- 
jver [18], uses the Ellipsoid algorithm with a separation oracle. While this particular approach isn't 
faster than the combinatorial methods, similar approaches might be, such as using interior-point 
methods or multiplicative weights. Unfortunately, we have been unable to find a running time 
analysis in the literature of such methods in this context. 

Summarizing, the running time of this phase is 

0{u^\ognTM)- 

Swap rounding Swap rounding consists of an initialization phase follows by u iterations of 
the basic subroutine MergeBases. The object of the initialization phase is to present y as a convex 
combination of at most u bases. According to [7], this can be done in time 0{u^T^). MergeBases 
consists of up to n iterations. Each such iteration takes time 0{nT_M + Tf). Thus the running time 
is dominated by the initialization phase, whose running time is 

0{u'Tm). 
C Properties of the $\gamma$ sequence

In this section, we prove the following properties of the $\gamma$ sequences defined in Section 3.

Lemma 3.1. The sequences $\gamma^{(m)} = \gamma_0^{(m)}, \ldots, \gamma_{m+1}^{(m)}$ satisfy the following properties:

(a) $\gamma^{(m)}$ satisfies recurrence ($\gamma$-REC) for all $m \ge 1$.

(b) $\gamma^{(m)}$ is non-decreasing and non-negative for $0 \le m \le 2n$.

(c) $\gamma^{(m-1)}$ and $\gamma^{(m)}$ satisfy both ($\gamma$-DOWN) and ($\gamma$-UP) for all $m \ge 1$.

Recall that we defined $q = 2\lfloor (n-1)/2 \rfloor + 1$ and set $\gamma_q^{(2n)} = \gamma_{q+1}^{(2n)} = 1$, then used the recurrence

\[ \ell \gamma_{\ell+1}^{(m)} = (2\ell - m + c - 1)\gamma_\ell^{(m)} + (m - \ell + 1)\gamma_{\ell-1}^{(m)}, \qquad 1 \le \ell \le m \tag{$\gamma$-REC} \]

with $m = 2n$ to define $\gamma_0^{(2n)}, \ldots, \gamma_{q-1}^{(2n)}, \gamma_{q+2}^{(2n)}, \ldots, \gamma_{2n+1}^{(2n)}$. Next, we normalized this sequence by
dividing by $\gamma_0^{(2n)}$ and then used the recurrences

\[ \gamma_\ell^{(m-1)} = \gamma_0^{(m)} + \frac{c}{m} \sum_{k=1}^{\ell} \gamma_k^{(m)}, \tag{$\gamma$-DOWN} \]

\[ \gamma_{\ell+1}^{(m)} = c^{-1} m \bigl( \gamma_{\ell+1}^{(m-1)} - \gamma_\ell^{(m-1)} \bigr), \qquad \gamma_0^{(m)} = \gamma_0^{(m-1)}, \qquad \gamma_{m+1}^{(m)} = \gamma_m^{(m-1)} \tag{$\gamma$-UP} \]

to generate the sequences $\gamma^{(m)}$ for $1 \le m < 2n$ and $m > 2n$. We now turn to proving various
properties of the resulting sequences. In order to prove Lemma 3.1 we first prove the following
three lemmas, which will serve as the base case, downward induction step and upward induction
step, respectively, of the proof for Lemma 3.1.
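The construction of the base sequence $\gamma^{(2n)}$ can be sketched in a few lines of exact rational arithmetic. This is an illustration under stated assumptions rather than the authors' code: it assumes the recurrence $\ell\gamma_{\ell+1} = (2\ell - m + c - 1)\gamma_\ell + (m - \ell + 1)\gamma_{\ell-1}$ with $m = 2n$, the seed $\gamma_q = \gamma_{q+1} = 1$ with $q = 2\lfloor (n-1)/2 \rfloor + 1$, and normalization by $\gamma_0$; the function name is hypothetical.

```python
from fractions import Fraction


def gamma_2n(n, c):
    """Normalized sequence gamma_0, ..., gamma_{2n+1} for m = 2n (exact rationals)."""
    m = 2 * n
    q = 2 * ((n - 1) // 2) + 1          # odd index with n - 1 <= q <= n (assumed)
    g = [None] * (m + 2)
    g[q] = g[q + 1] = Fraction(1)
    # Solve the recurrence at l for gamma_{l+1}, moving upward from the seed.
    for l in range(q + 1, m + 1):
        g[l + 1] = ((2 * l - m + c - 1) * g[l] + (m - l + 1) * g[l - 1]) / l
    # Solve the recurrence at l for gamma_{l-1}, moving downward from the seed.
    for l in range(q, 0, -1):
        g[l - 1] = (l * g[l + 1] - (2 * l - m + c - 1) * g[l]) / (m - l + 1)
    return [value / g[0] for value in g]


seq_n1 = gamma_2n(1, Fraction(1))
seq_n2 = gamma_2n(2, Fraction(1))
seq_n3 = gamma_2n(3, Fraction(1))
```

Under these assumptions the last entry $\gamma_{2n+1}^{(2n)}$ is the constant $E$, since the boundary rule $\gamma_{m+1}^{(m)} = \gamma_m^{(m-1)}$ propagates it unchanged through all levels; for $c = 1$ the values $3$, $49/18$, $193/71$ decrease towards $e$.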

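Lemma C.1. The sequence $\gamma^{(2n)} = \gamma_0^{(2n)}, \ldots, \gamma_{2n+1}^{(2n)}$ is non-decreasing and non-negative.

Proof. Our proof makes use of the following inequalities, which are immediate from the definition
of $q$:

\[ n - 1 \le q \le n. \]

We consider the sequence $\gamma^{(2n)}$ before it has been normalized by dividing by $\gamma_0^{(2n)}$. In order to
prove the lemma, it is sufficient to show that the unnormalized sequence is non-decreasing and
$\gamma_0^{(2n)} \ge 0$. Consider the sequence $\gamma^{(2n+1)}$, and note that, for $0 \le \ell \le 2n$ we have $\gamma_{\ell+1}^{(2n+1)} \ge 0$ if and
only if $\gamma_{\ell+1}^{(2n)} - \gamma_\ell^{(2n)} \ge 0$. We prove that $\gamma_{\ell+1}^{(2n+1)} \ge 0$ by induction on $\ell$, separately for $\ell \ge q+1$ and
for $\ell \le q-1$. By definition, $\gamma_{q+1}^{(2n+1)} = 0$, since $\gamma_q^{(2n)} = \gamma_{q+1}^{(2n)} = 1$.

We begin with the case $\ell \ge q+1$. Recurrences ($\gamma$-REC) and ($\gamma$-UP) imply that for $\ell$ such that
$1 \le \ell \le 2n$,

\begin{align*}
c\ell\gamma_{\ell+1}^{(2n+1)} &= (2n+1)\bigl( \ell\gamma_{\ell+1}^{(2n)} - \ell\gamma_\ell^{(2n)} \bigr) \\
&= (2n+1)\bigl( (2\ell - 2n + c - 1)\gamma_\ell^{(2n)} + (2n - \ell + 1)\gamma_{\ell-1}^{(2n)} - \ell\gamma_\ell^{(2n)} \bigr) \\
&= (2n+1)\bigl( (\ell - 2n + c - 1)(\gamma_\ell^{(2n)} - \gamma_{\ell-1}^{(2n)}) + c\gamma_{\ell-1}^{(2n)} \bigr) \\
&= c(\ell - 2n + c - 1)\gamma_\ell^{(2n+1)} + c(2n+1)\gamma_{\ell-1}^{(2n)}.
\end{align*}

Hence, $\gamma_{\ell+1}^{(2n+1)} \ge 0$ if and only if

\[ (2n+1)\gamma_{\ell-1}^{(2n)} \ge (2n - \ell - c + 1)\gamma_\ell^{(2n+1)}. \tag{12} \]

If $\ell = q+1$ then (12) clearly holds since $\gamma_{q+1}^{(2n+1)} = 0$ and $\gamma_q^{(2n)} = 1$. So we assume that $\ell \ge q+2$.
We have:

\[ c(\ell-1)\gamma_\ell^{(2n+1)} = c(\ell - 2n + c - 2)\gamma_{\ell-1}^{(2n+1)} + c(2n+1)\gamma_{\ell-2}^{(2n)} \le c(2n+1)\gamma_{\ell-2}^{(2n)} \le c(2n+1)\gamma_{\ell-1}^{(2n)}, \]

where we have used $\ell + c \le 2n+2$ and the induction hypothesis $\gamma_{\ell-1}^{(2n+1)} \ge 0$, which implies
$\gamma_{\ell-2}^{(2n)} \le \gamma_{\ell-1}^{(2n)}$. By the induction hypothesis, we further have $\gamma_\ell^{(2n+1)} \ge 0$. Therefore (12) follows
from $\ell \ge q+2 \ge n+1$, which implies $2n - \ell - c + 1 \le n - c < n \le \ell - 1$.

Consider next the case $\ell \le q-1$. Recurrence ($\gamma$-REC) implies that for $\ell$ such that $0 \le \ell \le 2n-1$,

\begin{align*}
c(\ell+1)\gamma_{\ell+2}^{(2n+1)} &= (2n+1)\bigl( (\ell+1)\gamma_{\ell+2}^{(2n)} - (\ell+1)\gamma_{\ell+1}^{(2n)} \bigr) \\
&= (2n+1)\bigl( (2\ell - 2n + c + 1)\gamma_{\ell+1}^{(2n)} + (2n - \ell)\gamma_\ell^{(2n)} - (\ell+1)\gamma_{\ell+1}^{(2n)} \bigr) \\
&= (2n+1)\bigl( (\ell - 2n)(\gamma_{\ell+1}^{(2n)} - \gamma_\ell^{(2n)}) + c\gamma_{\ell+1}^{(2n)} \bigr) \\
&= c(\ell - 2n)\gamma_{\ell+1}^{(2n+1)} + c(2n+1)\gamma_{\ell+1}^{(2n)},
\end{align*}

or, equivalently,

\[ (2n - \ell)\gamma_{\ell+1}^{(2n+1)} = (2n+1)\gamma_{\ell+1}^{(2n)} - (\ell+1)\gamma_{\ell+2}^{(2n+1)}. \]

This shows that $\gamma_{\ell+1}^{(2n+1)} \ge 0$ if and only if

\[ (2n+1)\gamma_{\ell+1}^{(2n)} \ge (\ell+1)\gamma_{\ell+2}^{(2n+1)}. \tag{13} \]

If $\ell = q-1$ then (13) clearly holds since $\gamma_{q+1}^{(2n+1)} = 0$ and $\gamma_q^{(2n)} = 1$. So we assume that $\ell \le q-2$.
We have:

\[ (2n - \ell - 1)\gamma_{\ell+2}^{(2n+1)} = (2n+1)\gamma_{\ell+2}^{(2n)} - (\ell+2)\gamma_{\ell+3}^{(2n+1)} \le (2n+1)\gamma_{\ell+2}^{(2n)}, \]

since $\gamma_{\ell+3}^{(2n+1)} \ge 0$ by the induction hypothesis. Subtracting $c\gamma_{\ell+2}^{(2n+1)} = (2n+1)(\gamma_{\ell+2}^{(2n)} - \gamma_{\ell+1}^{(2n)})$, which
follows from ($\gamma$-UP), we obtain

\[ (2n - \ell - 1 - c)\gamma_{\ell+2}^{(2n+1)} \le (2n+1)\gamma_{\ell+1}^{(2n)}. \]

From the induction hypothesis, $\gamma_{\ell+2}^{(2n+1)} \ge 0$. Therefore, (13) follows from $\ell \le q-2 \le n-2$, which
implies that $\ell + 1 \le n - 1 < n + 1 - c \le 2n - \ell - 1 - c$.

We have shown that $\gamma_\ell^{(2n+1)} \ge 0$ for all $1 \le \ell \le 2n+1$, and so (12) and (13) imply that $\gamma_\ell^{(2n)} \ge 0$
for $1 \le \ell \le 2n+1$. We now show that $\gamma_0^{(2n)} \ge 0$ as well. From the recurrence ($\gamma$-REC) we have:

\[ \gamma_2^{(2n)} = (1 + c - 2n)\gamma_1^{(2n)} + 2n\gamma_0^{(2n)}. \]

Since $\gamma_2^{(2n)} \ge 0$ and $\gamma_1^{(2n)} \ge 0$,

\[ \gamma_0^{(2n)} = \frac{(2n - 1 - c)\gamma_1^{(2n)} + \gamma_2^{(2n)}}{2n} \ge 0. \] □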
Lemma C.2. Suppose $\gamma^{(m)} = \gamma_0^{(m)}, \ldots, \gamma_{m+1}^{(m)}$ is a non-decreasing, non-negative sequence satisfying
($\gamma$-REC), and define $\gamma^{(m-1)} = \gamma_0^{(m-1)}, \ldots, \gamma_m^{(m-1)}$ using ($\gamma$-DOWN). Then,

(a) If $m \ge 2$, the sequence $\gamma^{(m-1)}$ satisfies recurrence ($\gamma$-REC).

(b) The sequences $\gamma^{(m-1)}$ and $\gamma^{(m)}$ satisfy the recurrence ($\gamma$-UP).

(c) The sequence $\gamma^{(m-1)}$ is non-decreasing and non-negative.

Proof. For part (a), let $k$ satisfy $1 \le k \le m-1$. We must show

\[ k\gamma_{k+1}^{(m-1)} = (2k - m + c)\gamma_k^{(m-1)} + (m - k)\gamma_{k-1}^{(m-1)}. \]

After applying ($\gamma$-DOWN) and multiplying by $m/c$, this is equivalent to

\[ k\gamma_{k+1}^{(m)} = (k - m + c)\gamma_k^{(m)} + c\sum_{i=1}^{k-1} \gamma_i^{(m)} + m\gamma_0^{(m)}. \]

Summing the recurrence ($\gamma$-REC) for $\ell = 1, \ldots, k$ yields exactly this equation.

For part (b), we begin by considering the base cases for recurrence ($\gamma$-UP). It follows directly
from ($\gamma$-DOWN) that $\gamma_0^{(m-1)} = \gamma_0^{(m)}$. Summing recurrence ($\gamma$-REC) for $\gamma^{(m)}$ for $\ell = 1, \ldots, m$, we get

\[ m\gamma_{m+1}^{(m)} = m\gamma_0^{(m)} + c\sum_{k=1}^{m} \gamma_k^{(m)} = m\gamma_m^{(m-1)}. \]

Next, we must show that

\[ \gamma_{\ell+1}^{(m)} = c^{-1} m \bigl( \gamma_{\ell+1}^{(m-1)} - \gamma_\ell^{(m-1)} \bigr) \]

for $1 \le \ell \le m-1$. After applying ($\gamma$-DOWN) and simplifying, this is equivalent to showing

\[ \gamma_{\ell+1}^{(m)} = c^{-1} m \cdot \frac{c}{m}\, \gamma_{\ell+1}^{(m)}. \]

For part (c), we note that $\gamma^{(m)}$ is non-negative, and from part (b) we have $\gamma_{\ell+1}^{(m)} \ge 0$ if and only
if $\gamma_{\ell+1}^{(m-1)} - \gamma_\ell^{(m-1)} \ge 0$. Thus, the sequence $\gamma^{(m-1)}$ is non-decreasing. Furthermore, from part (b),
$\gamma_0^{(m-1)} = \gamma_0^{(m)}$, which is non-negative. Thus, the entire sequence $\gamma^{(m-1)}$ is non-negative. □

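Lemma C.3. Suppose $\gamma^{(m-1)} = \gamma_0^{(m-1)}, \ldots, \gamma_m^{(m-1)}$ is a sequence satisfying recurrence ($\gamma$-REC).
Define $\gamma^{(m)} = \gamma_0^{(m)}, \ldots, \gamma_{m+1}^{(m)}$ using ($\gamma$-UP). Then,

(a) The sequence $\gamma^{(m)}$ satisfies recurrence ($\gamma$-REC).

(b) The sequences $\gamma^{(m-1)}$ and $\gamma^{(m)}$ satisfy the recurrence ($\gamma$-DOWN).

Proof. We start by proving part (a). When $\ell = 1$, we have

\begin{align*}
\gamma_2^{(m)} &= c^{-1}m\bigl( \gamma_2^{(m-1)} - \gamma_1^{(m-1)} \bigr) && \text{by ($\gamma$-UP)} \\
&= c^{-1}m\bigl( (2 - m + c)\gamma_1^{(m-1)} + (m-1)\gamma_0^{(m-1)} - \gamma_1^{(m-1)} \bigr) && \text{by ($\gamma$-REC)} \\
&= c^{-1}m\bigl( (1 - m + c)\gamma_1^{(m-1)} - (1 - m + c)\gamma_0^{(m-1)} + c\gamma_0^{(m-1)} \bigr) && \text{by algebra} \\
&= (1 - m + c)\gamma_1^{(m)} + m\gamma_0^{(m-1)} && \text{by ($\gamma$-UP)} \\
&= (1 - m + c)\gamma_1^{(m)} + m\gamma_0^{(m)} && \text{by ($\gamma$-UP)}.
\end{align*}

When $\ell = m$, we have

\begin{align*}
m\gamma_{m+1}^{(m)} &= m\gamma_m^{(m-1)} && \text{by ($\gamma$-UP)} \\
&= c^{-1}m\bigl( (m - 1 + c)\gamma_m^{(m-1)} - (m-1)\gamma_m^{(m-1)} \bigr) && \text{by algebra} \\
&= c^{-1}m\bigl( (m - 1 + c)\gamma_m^{(m-1)} - (m - 2 + c)\gamma_{m-1}^{(m-1)} - \gamma_{m-2}^{(m-1)} \bigr) && \text{by ($\gamma$-REC)} \\
&= c^{-1}m\bigl( (m - 1 + c)(\gamma_m^{(m-1)} - \gamma_{m-1}^{(m-1)}) + (\gamma_{m-1}^{(m-1)} - \gamma_{m-2}^{(m-1)}) \bigr) && \text{by algebra} \\
&= (m - 1 + c)\gamma_m^{(m)} + \gamma_{m-1}^{(m)} && \text{by ($\gamma$-UP) twice}.
\end{align*}

When $2 \le \ell \le m-1$, we have

\begin{align*}
\ell\gamma_{\ell+1}^{(m)} &= c^{-1}m\bigl( \ell\gamma_{\ell+1}^{(m-1)} - \ell\gamma_\ell^{(m-1)} \bigr) && \text{by ($\gamma$-UP)} \\
&= c^{-1}m\bigl( (2\ell - m + c)\gamma_\ell^{(m-1)} + (m - \ell)\gamma_{\ell-1}^{(m-1)} - \ell\gamma_\ell^{(m-1)} \bigr) && \text{by ($\gamma$-REC)} \\
&= c^{-1}m\bigl( (2\ell - m + c - 1)\gamma_\ell^{(m-1)} + (m - \ell)\gamma_{\ell-1}^{(m-1)} - (\ell - 1)\gamma_\ell^{(m-1)} \bigr) && \text{by algebra} \\
&= c^{-1}m\bigl( (2\ell - m + c - 1)\gamma_\ell^{(m-1)} + (m - \ell)\gamma_{\ell-1}^{(m-1)} \\
&\qquad\quad - (2\ell - m + c - 2)\gamma_{\ell-1}^{(m-1)} - (m - \ell + 1)\gamma_{\ell-2}^{(m-1)} \bigr) && \text{by ($\gamma$-REC)} \\
&= c^{-1}m\bigl( (2\ell - m + c - 1)\gamma_\ell^{(m-1)} + (m - \ell + 1)\gamma_{\ell-1}^{(m-1)} \\
&\qquad\quad - (2\ell - m + c - 1)\gamma_{\ell-1}^{(m-1)} - (m - \ell + 1)\gamma_{\ell-2}^{(m-1)} \bigr) && \text{by algebra} \\
&= (2\ell - m + c - 1)\gamma_\ell^{(m)} + (m - \ell + 1)\gamma_{\ell-1}^{(m)} && \text{by ($\gamma$-UP) twice}.
\end{align*}

For part (b) we must show that for $0 \le \ell \le m$,

\[ \gamma_\ell^{(m-1)} = \gamma_0^{(m)} + \frac{c}{m}\sum_{k=1}^{\ell} \gamma_k^{(m)}. \]

Applying ($\gamma$-UP), this is equivalent to showing that:

\[ \gamma_\ell^{(m-1)} = \gamma_0^{(m-1)} + \sum_{k=1}^{\ell} \bigl( \gamma_k^{(m-1)} - \gamma_{k-1}^{(m-1)} \bigr). \] □

We are now ready to prove the properties of the $\gamma$ sequences stated in Lemma 3.1.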

Proof of Lemma 3.1. By definition, $\gamma^{(2n)}$ satisfies recurrence ($\gamma$-REC). Inductively applying Lemma
C.2 (a) for $m = 2n, 2n-1, \ldots, 1$ and Lemma C.3 (a) for $m = 2n+1, 2n+2, \ldots$ then completes the
proof of part (a).

By Lemma C.1, $\gamma^{(2n)}$ is non-decreasing and non-negative. Inductively applying Lemma C.2 (c)
for $m = 2n, 2n-1, \ldots, 1$ then proves part (b).

For part (c), we note that for all sequences $\gamma^{(m)}$ with $m < 2n$, ($\gamma$-DOWN) holds by
construction. Inductively applying Lemma C.2 (b) then shows that ($\gamma$-UP) holds. Similarly, if
$m > 2n$, then ($\gamma$-UP) holds by construction and inductively applying Lemma C.3 (b) shows that
($\gamma$-DOWN) holds. □






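Lemma 3.2. For all $m > 0$, the terms of $\gamma^{(m)}$ are given by the following formula:

\[ \gamma_{\ell+1}^{(m)} = (-1)^\ell \sum_{k=0}^{m-1} \frac{m!}{k!}\, c^{k-m} \left[ (-1)^k \binom{m-1-k}{\ell-k} E - \binom{m-1-k}{\ell} \right], \tag{$\gamma$-CLOSED} \]

where $E = \gamma_1^{(0)}$ and $0 \le \ell \le m-1$.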



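Proof. We proceed by induction on $m$. When $m = 0$, the statement is trivial, as we have only
$\gamma_0^{(0)} = 1$ and $\gamma_1^{(0)} = E$. If $m \ge 1$, then by Lemma 3.1 the sequence $\gamma^{(m-1)}$ is related to $\gamma^{(m)}$ by
($\gamma$-UP). Applying the induction hypothesis to $\gamma^{(m-1)}$, we find that $\gamma_{\ell+1}^{(m)}$ is equal to

\begin{align*}
c^{-1}m\bigl( \gamma_{\ell+1}^{(m-1)} - \gamma_\ell^{(m-1)} \bigr)
&= (-1)^\ell \sum_{k=0}^{m-2} \frac{m!}{k!}\, c^{k-m} \left[ (-1)^k \binom{m-2-k}{\ell-k} E - \binom{m-2-k}{\ell} \right] \\
&\quad + (-1)^\ell \sum_{k=0}^{m-2} \frac{m!}{k!}\, c^{k-m} \left[ (-1)^k \binom{m-2-k}{\ell-1-k} E - \binom{m-2-k}{\ell-1} \right] \\
&= (-1)^\ell \sum_{k=0}^{m-2} \frac{m!}{k!}\, c^{k-m} \left[ (-1)^k \binom{m-1-k}{\ell-k} E - \binom{m-1-k}{\ell} \right].
\end{align*}

If $1 \le \ell \le m-2$, then we have both $\binom{0}{\ell-m+1} = 0$ and $\binom{0}{\ell} = 0$, and so:

\begin{align*}
\gamma_{\ell+1}^{(m)} &= (-1)^\ell \sum_{k=0}^{m-2} \frac{m!}{k!}\, c^{k-m} \left[ (-1)^k \binom{m-1-k}{\ell-k} E - \binom{m-1-k}{\ell} \right] \\
&= (-1)^\ell \sum_{k=0}^{m-1} \frac{m!}{k!}\, c^{k-m} \left[ (-1)^k \binom{m-1-k}{\ell-k} E - \binom{m-1-k}{\ell} \right].
\end{align*}

It remains to show that the formula holds for $\ell = 0$ and for $\ell = m-1$. In the case of $\ell = 0$,
($\gamma$-CLOSED) gives

\[ \gamma_1^{(m)} = (-1)^0 \sum_{k=0}^{m-1} \frac{m!}{k!}\, c^{k-m} \left[ (-1)^k \binom{m-1-k}{-k} E - \binom{m-1-k}{0} \right] = m!\, c^{-m} E - \sum_{k=0}^{m-1} \frac{m!}{k!}\, c^{k-m}, \]

while using the recurrence ($\gamma$-UP) and the induction hypothesis, we obtain

\begin{align*}
\gamma_1^{(m)} = c^{-1}m\bigl( \gamma_1^{(m-1)} - \gamma_0^{(m-1)} \bigr) &= c^{-1}m\bigl( \gamma_1^{(m-1)} - 1 \bigr) \\
&= c^{-1}m \left( (m-1)!\, c^{1-m} E - \sum_{k=0}^{m-2} \frac{(m-1)!}{k!}\, c^{k-m+1} - 1 \right) \\
&= m!\, c^{-m} E - \sum_{k=0}^{m-2} \frac{m!}{k!}\, c^{k-m} - c^{-1}m \\
&= m!\, c^{-m} E - \sum_{k=0}^{m-1} \frac{m!}{k!}\, c^{k-m}.
\end{align*}

In the case $\ell = m-1$, ($\gamma$-CLOSED) gives

\begin{align*}
\gamma_m^{(m)} &= (-1)^{m-1} \sum_{k=0}^{m-1} \frac{m!}{k!}\, c^{k-m} \left[ (-1)^k \binom{m-1-k}{m-1-k} E - \binom{m-1-k}{m-1} \right] \\
&= (-1)^{m-1} \left[ \sum_{k=0}^{m-1} \frac{m!}{k!} (-1)^k c^{k-m} E - m!\, c^{-m} \right],
\end{align*}

while using the recurrence ($\gamma$-UP) and the induction hypothesis, we obtain

\begin{align*}
\gamma_m^{(m)} = c^{-1}m\bigl( \gamma_m^{(m-1)} - \gamma_{m-1}^{(m-1)} \bigr) &= c^{-1}m\bigl( E - \gamma_{m-1}^{(m-1)} \bigr) \\
&= c^{-1}mE - c^{-1}m(-1)^{m-2} \sum_{k=0}^{m-2} \frac{(m-1)!}{k!}\, c^{k-m+1} \left[ (-1)^k \binom{m-2-k}{m-2-k} E - \binom{m-2-k}{m-2} \right] \\
&= c^{-1}mE + (-1)^{m-1} \left[ \sum_{k=0}^{m-2} \frac{m!}{k!} (-1)^k c^{k-m} E - m!\, c^{-m} \right] \\
&= (-1)^{m-1} \left[ \sum_{k=0}^{m-1} \frac{m!}{k!} (-1)^k c^{k-m} E - m!\, c^{-m} \right].
\end{align*} □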



The next identity will be useful in proving some of the remaining results in the Appendix. 
Lemma C.4. For all m > I: 

m 

E(" 



(m) (1) 



k=l 



Proof. We have 



(m.) 
fk 

k=l k=l 



where the first and third equalities follow from Lemma l3.ll (c). Furthermore, recurrence ( i7-RECj ) 
for m = 1 states that 72^^ — 7q^^ = c^[^^ . □ 
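Finally, we briefly verify the equivalence between the explicit formula ($\gamma$-CLOSED) and the
expression (7) from Section 4.2.

Lemma C.5.

\[ \gamma_{\ell+1}^{(m)} = (-1)^\ell\, m!\, c^{-m} \binom{m-1}{\ell} \bigl[ Q_{m-1-\ell,\ell}(c)\, E - P_{m-1-\ell,\ell}(c) \bigr]. \]

Proof. From ($\gamma$-CLOSED) we have

\begin{align*}
\gamma_{\ell+1}^{(m)} &= (-1)^\ell \sum_{k=0}^{m-1} \frac{m!}{k!}\, c^{k-m} \left[ (-1)^k \binom{m-1-k}{\ell-k} E - \binom{m-1-k}{\ell} \right] \\
&= (-1)^\ell\, m!\, c^{-m} \left[ \sum_{k=0}^{\ell} \frac{(-c)^k}{k!} \binom{m-1-k}{\ell-k} E - \sum_{k=0}^{m-1-\ell} \frac{c^k}{k!} \binom{m-1-k}{\ell} \right] \\
&= (-1)^\ell\, m!\, c^{-m} \left[ \sum_{k=0}^{\ell} \frac{(-c)^k (m-1-k)!}{k!\,(m-1-\ell)!\,(\ell-k)!}\, E - \sum_{k=0}^{m-1-\ell} \frac{c^k (m-1-k)!}{k!\,(m-1-\ell-k)!\,\ell!} \right] \\
&= (-1)^\ell\, m!\, c^{-m}\, \frac{(m-1)!}{(m-1-\ell)!\,\ell!} \left[ \sum_{k=0}^{\ell} \frac{(-c)^k (m-1-k)!\,\ell!}{k!\,(m-1)!\,(\ell-k)!}\, E - \sum_{k=0}^{m-1-\ell} \frac{c^k (m-1-k)!\,(m-1-\ell)!}{k!\,(m-1)!\,(m-1-\ell-k)!} \right] \\
&= (-1)^\ell\, m!\, c^{-m} \binom{m-1}{\ell} \bigl[ Q_{m-1-\ell,\ell}(c)\, E - P_{m-1-\ell,\ell}(c) \bigr].
\end{align*} □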



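D Properties of g

Here we prove that $g$ has the properties claimed in Section 3. We begin by proving a small identity.

Lemma D.1. For all $1 \le \ell \le m$:

\[ (m+1)\beta_\ell^{(m)} = (m + 1 - \ell)\beta_\ell^{(m+1)} + (\ell + 1)\beta_{\ell+1}^{(m+1)}. \]

Proof. The sequences $\gamma^{(m)}$ and $\gamma^{(m+1)}$ satisfy ($\gamma$-REC) and are related by ($\gamma$-UP), as shown in
Lemma 3.1. Multiplying the identity by $\ell$ and using the definition $\beta_k^{(m)} = k^{-1}\gamma_k^{(m)}$, we obtain

\[ (m+1)\gamma_\ell^{(m)} = (m + 1 - \ell)\gamma_\ell^{(m+1)} + \ell\gamma_{\ell+1}^{(m+1)}. \]

Applying ($\gamma$-UP) to each term on the right, the desired identity is equivalent to

\begin{align*}
(m+1)\gamma_\ell^{(m)} &= c^{-1}(m+1)\bigl[ (m + 1 - \ell)(\gamma_\ell^{(m)} - \gamma_{\ell-1}^{(m)}) + \ell(\gamma_{\ell+1}^{(m)} - \gamma_\ell^{(m)}) \bigr] \\
&= c^{-1}(m+1)\bigl[ \ell\gamma_{\ell+1}^{(m)} + (m - 2\ell + 1)\gamma_\ell^{(m)} - (m - \ell + 1)\gamma_{\ell-1}^{(m)} \bigr],
\end{align*}

where we have used the recurrence ($\gamma$-REC) in the final line: by ($\gamma$-REC), the bracketed expression equals $c\gamma_\ell^{(m)}$. □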
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D.l Monotonicity 

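Lemma D.2. For any set $S$ of size $m < 2n$ and $x \notin S$, $g(S) \le g(S \cup \{x\})$.
Moreover, if $f(T) = f(T \cup \{x\})$ for each $T \subseteq S$ then $g(S) = g(S \cup \{x\})$.

Proof. Define for $1 \le k \le m$,

\[ p_k = \frac{(m+1-k)\,\beta_k^{(m+1)}}{(m+1)\,\beta_k^{(m)}}, \qquad q_k = \frac{(k+1)\,\beta_{k+1}^{(m+1)}}{(m+1)\,\beta_k^{(m)}}. \]

Lemma D.1 implies that $p_k + q_k = 1$. Since $\gamma^{(m)}$ and $\gamma^{(m+1)}$ are non-negative from Lemma 3.1
part (c), $\beta^{(m)}$ and $\beta^{(m+1)}$ are non-negative and thus $0 \le p_k, q_k \le 1$. The following identities will
be useful:

\begin{align*}
p_k \frac{\beta_k^{(m)}}{\binom{m}{k}} &= \beta_k^{(m+1)}\,\frac{m+1-k}{m+1}\,\frac{1}{\binom{m}{k}} = \frac{\beta_k^{(m+1)}}{\binom{m+1}{k}}, \\
q_{k-1} \frac{\beta_{k-1}^{(m)}}{\binom{m}{k-1}} &= \beta_k^{(m+1)}\,\frac{k}{m+1}\,\frac{1}{\binom{m}{k-1}} = \frac{\beta_k^{(m+1)}}{\binom{m+1}{k}}.
\end{align*}

We have

\begin{align*}
g(S) &= \sum_{k=1}^{m} \frac{\beta_k^{(m)}}{\binom{m}{k}} \sum_{\substack{T \subseteq S \\ |T| = k}} f(T) \\
&\le \sum_{k=1}^{m} \frac{\beta_k^{(m)}}{\binom{m}{k}} \sum_{\substack{T \subseteq S \\ |T| = k}} \bigl(p_k f(T) + q_k f(T \cup \{x\})\bigr) \\
&= \sum_{k=1}^{m} p_k \frac{\beta_k^{(m)}}{\binom{m}{k}} \sum_{\substack{T \subseteq S \\ |T| = k}} f(T) + \sum_{k=2}^{m+1} q_{k-1} \frac{\beta_{k-1}^{(m)}}{\binom{m}{k-1}} \sum_{\substack{T \subseteq S \\ |T| = k-1}} f(T \cup \{x\}) \\
&\le \sum_{k=1}^{m} \frac{\beta_k^{(m+1)}}{\binom{m+1}{k}} \sum_{\substack{T \subseteq S \\ |T| = k}} f(T) + \sum_{k=2}^{m+1} \frac{\beta_k^{(m+1)}}{\binom{m+1}{k}} \sum_{\substack{T \subseteq S \\ |T| = k-1}} f(T \cup \{x\}) + \frac{\beta_1^{(m+1)}}{m+1} f(\{x\}) \\
&= \sum_{k=1}^{m+1} \frac{\beta_k^{(m+1)}}{\binom{m+1}{k}} \sum_{\substack{T \subseteq S \cup \{x\} \\ |T| = k}} f(T) = g(S \cup \{x\}).
\end{align*}

The first inequality uses the monotonicity of $f$ together with $p_k + q_k = 1$, and the second uses the
non-negativity of $f$. If $f(T \cup \{x\}) = f(T)$ for all $T \subseteq S$ then all the inequalities become tight.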
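□

D.2 Submodularity

We first prove another small identity:

\[ 2\,\frac{\beta_k^{(m)}}{\binom{m}{k}} + \frac{\beta_{k+2}^{(m+1)}}{\binom{m+1}{k+2}} = \frac{\beta_k^{(m-1)}}{\binom{m-1}{k}} + \frac{\beta_k^{(m+1)}}{\binom{m+1}{k}}. \]

Proof. Multiplying the desired identity by $(m-k)(m+1)\binom{m}{k}$, we obtain the equivalent statement

\[ 2(m-k)(m+1)\,\beta_k^{(m)} + (k+1)(k+2)\,\beta_{k+2}^{(m+1)} = m(m+1)\,\beta_k^{(m-1)} + (m-k)(m-k+1)\,\beta_k^{(m+1)}. \]

Applying Lemma D.1 to the left yields

\[ 2(m-k)(m+1-k)\,\beta_k^{(m+1)} + 2(m-k)(k+1)\,\beta_{k+1}^{(m+1)} + (k+1)(k+2)\,\beta_{k+2}^{(m+1)}. \]

Applying Lemma D.1 twice to the right yields

\begin{align*}
&(m-k)(m+1-k)\,\beta_k^{(m+1)} + (m-k)(k+1)\,\beta_{k+1}^{(m+1)} + (k+1)(m-k)\,\beta_{k+1}^{(m+1)} \\
&\quad + (k+1)(k+2)\,\beta_{k+2}^{(m+1)} + (m-k)(m+1-k)\,\beta_k^{(m+1)},
\end{align*}

which is the same as the expression for the left. □

Lemma D.3. For any set $S$ of size $m - 1 < 2n - 1$ and distinct $x, y \notin S$, we have

\[ g(S \cup \{x\}) + g(S \cup \{y\}) \ge g(S \cup \{x, y\}) + g(S). \]

Proof. Define $p_k, q_k$ for $k$ satisfying $1 \le k \le 2n - 1$ as in the proof of Lemma D.2. As in the lemma,
we have $p_k + q_k = 1$ and $0 \le p_k, q_k \le 1$. Then, the identity for $q_k$ from Lemma D.2 and the identity
proved at the start of this subsection show that:

\[ 2\,\frac{\beta_k^{(m)}}{\binom{m}{k}} + q_{k+1}\frac{\beta_{k+1}^{(m)}}{\binom{m}{k+1}} = 2\,\frac{\beta_k^{(m)}}{\binom{m}{k}} + \frac{\beta_{k+2}^{(m+1)}}{\binom{m+1}{k+2}} = \frac{\beta_k^{(m-1)}}{\binom{m-1}{k}} + \frac{\beta_k^{(m+1)}}{\binom{m+1}{k}}. \]

We have

\begin{align*}
g(S \cup \{x\}) + g(S \cup \{y\})
&= \sum_{k=1}^{m-1} 2\,\frac{\beta_k^{(m)}}{\binom{m}{k}} \sum_{\substack{T \subseteq S \\ |T| = k}} f(T) + \sum_{k=1}^{m} \frac{\beta_k^{(m)}}{\binom{m}{k}} \sum_{\substack{T \subseteq S \\ |T| = k-1}} \bigl(f(T \cup \{x\}) + f(T \cup \{y\})\bigr) \\
&\ge \sum_{k=1}^{m-1} 2\,\frac{\beta_k^{(m)}}{\binom{m}{k}} \sum_{\substack{T \subseteq S \\ |T| = k}} f(T) \\
&\quad + \sum_{k=1}^{m} \frac{\beta_k^{(m)}}{\binom{m}{k}} \sum_{\substack{T \subseteq S \\ |T| = k-1}} \bigl[p_k\bigl(f(T \cup \{x\}) + f(T \cup \{y\})\bigr) + q_k\bigl(f(T) + f(T \cup \{x, y\})\bigr)\bigr] \\
&= \sum_{k=1}^{m-1} \left(2\,\frac{\beta_k^{(m)}}{\binom{m}{k}} + q_{k+1}\frac{\beta_{k+1}^{(m)}}{\binom{m}{k+1}}\right) \sum_{\substack{T \subseteq S \\ |T| = k}} f(T) + q_1 \frac{\beta_1^{(m)}}{\binom{m}{1}} f(\emptyset) \\
&\quad + \sum_{k=1}^{m} p_k \frac{\beta_k^{(m)}}{\binom{m}{k}} \sum_{\substack{T \subseteq S \\ |T| = k-1}} \bigl(f(T \cup \{x\}) + f(T \cup \{y\})\bigr) + \sum_{k=2}^{m+1} q_{k-1} \frac{\beta_{k-1}^{(m)}}{\binom{m}{k-1}} \sum_{\substack{T \subseteq S \\ |T| = k-2}} f(T \cup \{x, y\}) \\
&\ge \sum_{k=1}^{m-1} \left(2\,\frac{\beta_k^{(m)}}{\binom{m}{k}} + q_{k+1}\frac{\beta_{k+1}^{(m)}}{\binom{m}{k+1}}\right) \sum_{\substack{T \subseteq S \\ |T| = k}} f(T) + \sum_{k=1}^{m} p_k \frac{\beta_k^{(m)}}{\binom{m}{k}} \sum_{\substack{T \subseteq S \\ |T| = k-1}} \bigl(f(T \cup \{x\}) + f(T \cup \{y\})\bigr) \\
&\quad + \sum_{k=2}^{m+1} q_{k-1} \frac{\beta_{k-1}^{(m)}}{\binom{m}{k-1}} \sum_{\substack{T \subseteq S \\ |T| = k-2}} f(T \cup \{x, y\}) \\
&= \sum_{k=1}^{m-1} \left(\frac{\beta_k^{(m-1)}}{\binom{m-1}{k}} + \frac{\beta_k^{(m+1)}}{\binom{m+1}{k}}\right) \sum_{\substack{T \subseteq S \\ |T| = k}} f(T) + \sum_{k=1}^{m} \frac{\beta_k^{(m+1)}}{\binom{m+1}{k}} \sum_{\substack{T \subseteq S \\ |T| = k-1}} \bigl(f(T \cup \{x\}) + f(T \cup \{y\})\bigr) \\
&\quad + \sum_{k=2}^{m+1} \frac{\beta_k^{(m+1)}}{\binom{m+1}{k}} \sum_{\substack{T \subseteq S \\ |T| = k-2}} f(T \cup \{x, y\}) \\
&= g(S) + g(S \cup \{x, y\}). \qquad □
\end{align*}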
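Lemmas D.2 and D.3 can be spot-checked by brute force on a small instance. The Python sketch below evaluates $g(S) = \sum_{k} \beta_k^{(|S|)} \binom{|S|}{k}^{-1} \sum_{T \subseteq S, |T| = k} f(T)$ for a toy coverage function and tests both properties on every subset. It is an illustration under assumptions, not the authors' code: the coefficients are generated with $E = e^c$, boundary values $\gamma_0^{(m)} = 1$, $\gamma_{m+1}^{(m)} = E$ and the update $\gamma_\ell^{(m)} = c^{-1} m (\gamma_\ell^{(m-1)} - \gamma_{\ell-1}^{(m-1)})$, all inferred from this appendix, and the names `beta_table`, `potential`, `coverage` and `SETS` are hypothetical.

```python
import math
from itertools import combinations


def beta_table(max_m, c=1.0):
    """beta[m][k] = gamma_k^(m) / k for 1 <= k <= m (index 0 is unused).

    Assumed construction (inferred, not quoted): E = e^c, gamma_0 = 1,
    gamma_{m+1} = E, gamma_l^(m) = (m / c) * (gamma_l^(m-1) - gamma_{l-1}^(m-1)).
    """
    E = math.exp(c)
    gamma = [[1.0, E]]
    for m in range(1, max_m + 1):
        prev = gamma[-1]
        inner = [(m / c) * (prev[k] - prev[k - 1]) for k in range(1, m + 1)]
        gamma.append([1.0] + inner + [E])
    return [[None] + [row[k] / k for k in range(1, m + 1)]
            for m, row in enumerate(gamma)]


def potential(f, S, beta):
    """g(S): weighted sum of f over all non-empty subsets of S."""
    m = len(S)
    return sum(beta[m][k] / math.comb(m, k) * sum(f(T) for T in combinations(S, k))
               for k in range(1, m + 1))


def is_monotone_submodular(f, ground, c=1.0, tol=1e-9):
    """Check g(S) <= g(S+x) and g(S+x) + g(S+y) >= g(S+x+y) + g(S) everywhere."""
    beta = beta_table(len(ground), c)
    g = {frozenset(S): potential(f, S, beta)
         for size in range(len(ground) + 1)
         for S in combinations(ground, size)}
    for S, value in g.items():
        outside = [x for x in ground if x not in S]
        if any(g[S | {x}] < value - tol for x in outside):
            return False
        for x, y in combinations(outside, 2):
            if g[S | {x}] + g[S | {y}] < g[S | {x, y}] + value - tol:
                return False
    return True


# Toy coverage instance (made up for illustration).
SETS = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4, 5}, "d": {1, 5}}


def coverage(T):
    """Number of points covered by the sets named in T."""
    covered = set()
    for name in T:
        covered |= SETS[name]
    return len(covered)
```

A small tolerance is used because some instances of the submodularity inequality hold with equality (for example, two disjoint sets added to the empty set), so rounding can push the floating-point difference slightly below zero.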



D.3 Curvature 

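Lemma D.4. For any set $S$ of size $m < 2n$ and $x \notin S$, if $f(T \cup \{x\}) \ge f(T) + (1 - c) f(\{x\})$ for
all $T \subseteq S$, then

\[ g(S \cup \{x\}) \ge g(S) + (1 - c')\,g(\{x\}), \qquad c' = c \left(1 - \frac{\beta_1^{(m+1)}}{(m+1)\,\beta_1^{(1)}}\right). \]

Proof. Lemma D.1 implies that for $k$ satisfying $1 \le k \le m$,

\[ \frac{\beta_k^{(m+1)}}{\binom{m+1}{k}} + \frac{\beta_{k+1}^{(m+1)}}{\binom{m+1}{k+1}} = \frac{\beta_k^{(m)}}{\binom{m}{k}}. \]

Using this and Lemma C.4 we have

\begin{align*}
g(S \cup \{x\}) &= \sum_{k=1}^{m+1} \frac{\beta_k^{(m+1)}}{\binom{m+1}{k}} \Biggl[\sum_{\substack{T \subseteq S \\ |T| = k}} f(T) + \sum_{\substack{T \subseteq S \\ |T| = k-1}} f(T \cup \{x\})\Biggr] \\
&\ge \sum_{k=1}^{m+1} \frac{\beta_k^{(m+1)}}{\binom{m+1}{k}} \Biggl[\sum_{\substack{T \subseteq S \\ |T| = k}} f(T) + \sum_{\substack{T \subseteq S \\ |T| = k-1}} \bigl(f(T) + (1-c) f(\{x\})\bigr)\Biggr] + \frac{c\,\beta_1^{(m+1)}}{m+1} f(\{x\}) \\
&= \sum_{k=1}^{m} \left(\frac{\beta_k^{(m+1)}}{\binom{m+1}{k}} + \frac{\beta_{k+1}^{(m+1)}}{\binom{m+1}{k+1}}\right) \sum_{\substack{T \subseteq S \\ |T| = k}} f(T) + \Biggl[\sum_{k=1}^{m+1} \frac{\binom{m}{k-1}}{\binom{m+1}{k}}\,\beta_k^{(m+1)}\Biggr] (1-c) f(\{x\}) + \frac{c\,\beta_1^{(m+1)}}{m+1} f(\{x\}) \\
&= \sum_{k=1}^{m} \frac{\beta_k^{(m)}}{\binom{m}{k}} \sum_{\substack{T \subseteq S \\ |T| = k}} f(T) + \Biggl[\sum_{k=1}^{m+1} \frac{k\,\beta_k^{(m+1)}}{m+1}\Biggr] (1-c) f(\{x\}) + \frac{c\,\beta_1^{(m+1)}}{m+1} f(\{x\}) \\
&= g(S) + (1-c)\,\beta_1^{(1)} f(\{x\}) + \frac{c\,\beta_1^{(m+1)}}{m+1} f(\{x\}) \\
&= g(S) + (1-c)\,g(\{x\}) + \frac{c\,\beta_1^{(m+1)}}{(m+1)\,\beta_1^{(1)}}\,g(\{x\}).
\end{align*}

In the second line the hypothesis is applied to every non-empty $T$, while for $T = \emptyset$ we write
$f(\{x\}) = f(\emptyset) + (1-c) f(\{x\}) + c f(\{x\})$, using $f(\emptyset) = 0$ (that is, assuming $f$ is normalized).
The last line uses $g(\{x\}) = \beta_1^{(1)} f(\{x\})$.

□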



Note that $\beta_1^{(m+1)} > 0$, and so $c' < c$. We can think of the construction of $g$ from $f$ as an
operator $G$ on the space of functions. The lemma shows that the curvature of $G^{(t)}(f)$ (that is, $G$
applied $t$ times in a row to $f$) tends to zero with $t$, and so in the limit we get a linear function
which necessarily depends only on the value of $f$ on singletons. On the other hand, when $m$ is
large, $1 - \beta_1^{(m+1)} / \bigl((m+1)\,\beta_1^{(1)}\bigr) \approx 1$, and so $c' \approx c$.



E Sampling g 

Here we prove Lemma 3.3. Our analysis makes use of the following lemma, which is interesting in
its own right.

Lemma E.1. Let $f$ be a non-negative submodular function, and let $S$ be a set of size $m$. For $k$ in
the range $1 \le k \le m$, define $F(k) = \mathbb{E} f(X_k)$, where $X_k$ is a uniformly random subset of $S$ of size
$k$. Then

\[ F(k) \ge \frac{k}{m} F(m). \]

Proof. The proof is by induction on $k$. For $k = 1$ we have

\[ m F(1) = \sum_{x \in S} f(\{x\}) \ge f(S), \]

from the submodularity and non-negativity of $f$. Now, suppose that the claim is true for $k - 1$. Let
$S = \{s_1, \ldots, s_m\}$. From submodularity we have

\[ \sum_{t=k}^{m} f(\{s_1, \ldots, s_{k-1}, s_t\}) \ge (m - k) f(\{s_1, \ldots, s_{k-1}\}) + f(S). \]

Averaging over all permutations of indices and then applying the induction hypothesis, we obtain

\[ (m-k+1) F(k) \ge (m-k) F(k-1) + f(S) \ge (m-k)\frac{k-1}{m} f(S) + f(S) = (m-k+1)\frac{k}{m} f(S). \]

Since $F(m) = f(S)$, this is the claimed bound. □
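For intuition, Lemma E.1 can be checked exhaustively on a small example. The Python sketch below computes $F(k)$ exactly, as an average over all $k$-subsets using rational arithmetic, and tests $F(k) \ge (k/m) F(m)$ for every $k$. The coverage instance and the function names are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations


def average_value(f, S, k):
    """F(k): exact average of f over all subsets of S of size k."""
    subsets = list(combinations(S, k))
    return Fraction(sum(f(T) for T in subsets), len(subsets))


def satisfies_lemma_e1(f, S):
    """True if F(k) >= (k / m) * F(m) for every 1 <= k <= m = |S|."""
    m = len(S)
    top = average_value(f, S, m)  # F(m) = f(S)
    return all(average_value(f, S, k) >= Fraction(k, m) * top
               for k in range(1, m + 1))


# Toy coverage instance (hypothetical data): f(T) counts the covered points.
SETS = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4, 5}, "d": {1, 5}}


def coverage(T):
    covered = set()
    for name in T:
        covered |= SETS[name]
    return len(covered)
```

On this instance $F(1), \ldots, F(4)$ are $9/4$, $23/6$, $19/4$ and $5$, against the lower bounds $5/4$, $5/2$, $15/4$ and $5$.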

Recall that we made use of the following random process for estimating $g(S)$. Suppose that
$|S| = m < 2n$, and define $\alpha_m = \sum_{k=1}^{m} \beta_k^{(m)}$. We define the random set $X$ using the following
two-step experiment. First, let $L$ be a random variable taking value $k$ with probability $\beta_k^{(m)} / \alpha_m$; then
choose $X$ as a uniformly random subset of $S$ of size $L$. Then, from the linearity of expectation we
have $\alpha_m \mathbb{E} f(X) = g(S)$.
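The two-step experiment translates directly into a Monte Carlo estimator. The Python sketch below is a minimal illustration, not the authors' implementation: `weights` stands in for $(\beta_1^{(m)}, \ldots, \beta_m^{(m)})$ and is filled with arbitrary placeholder numbers, since the identity $\alpha_m \mathbb{E} f(X) = \sum_k \beta_k^{(m)} \mathbb{E} f(X_k)$ holds for any non-negative weights. All names are hypothetical.

```python
import random
from itertools import combinations


def exact_potential(f, S, weights):
    """sum_k weights[k-1] * (average of f over the k-subsets of S)."""
    total = 0.0
    for k, w in enumerate(weights, start=1):
        subsets = list(combinations(S, k))
        total += w * sum(f(T) for T in subsets) / len(subsets)
    return total


def estimate_potential(f, S, weights, num_samples, rng):
    """Estimate the same quantity as alpha * (sample mean of f(X))."""
    alpha = sum(weights)
    sizes = range(1, len(S) + 1)
    total = 0.0
    for _ in range(num_samples):
        # Step 1: draw the size L = k with probability weights[k-1] / alpha.
        size = rng.choices(sizes, weights=weights, k=1)[0]
        # Step 2: draw X uniformly among the subsets of S of that size.
        X = rng.sample(S, size)
        total += f(X)
    return alpha * total / num_samples


# Toy coverage instance and placeholder weights (not the paper's coefficients).
SETS = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4, 5}, "d": {1, 5}}
WEIGHTS = [1.2, 0.8, 0.6, 0.6]


def coverage(T):
    covered = set()
    for name in T:
        covered |= SETS[name]
    return len(covered)
```

Each sample costs one evaluation of $f$, whereas the exact sum ranges over all $2^m - 1$ non-empty subsets of $S$; this is the reason for estimating $g$ by sampling.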

We need the following upper bound on $\alpha_m$.

Lemma E.2. If $2 \le m < 2n$ then

\[ \alpha_m = \sum_{k=1}^{m} \beta_k^{(m)} \le 8 \log m. \]

Proof. From Lemma 3.1 part (b), we must have $\gamma^{(2)}$ non-decreasing. Using the explicit formula
($\gamma$-CLOSED) for $\gamma^{(2)}$ we find $\gamma_1^{(2)} = 2c^{-2}(E - 1 - c)$ and $\gamma_2^{(2)} = 2c^{-2}(1 - (1 - c)E)$. Algebra gives
$E \le (2 + c)/(2 - c) \le 3$. Additionally, because $\gamma^{(m)}$ is non-decreasing, we have $\gamma_k^{(m)} \le \gamma_{m+1}^{(m)} = E$
for all $k \le m + 1$. Therefore, using $m \ge 2$,

\[ \alpha_m = \sum_{k=1}^{m} \beta_k^{(m)} = \sum_{k=1}^{m} \frac{\gamma_k^{(m)}}{k} \le 3 \sum_{k=1}^{m} k^{-1} \le 3(\log m + 1) \le 8 \log m. \tag{15} \]

□

We now prove Lemma 3.3, showing that we can estimate $g$ reasonably well by sampling from this
random process.

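Lemma 3.3. Let $S$ be a set of size $m$. Let $N$ be a positive integer, and $X_1, \ldots, X_N$ be $N$ i.i.d.
random samples drawn from the distribution for $X$. Define $\tilde{g}(S) = \frac{1}{N} \sum_{i=1}^{N} \alpha_m f(X_i)$. For every $\epsilon > 0$,

\[ \Pr\bigl[\,|\tilde{g}(S) - g(S)| \ge \epsilon\, g(S)\,\bigr] \le 2 \exp\left(-\frac{\epsilon^2 N}{32 \log^2 m}\right). \]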



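Proof. From Lemma 3.1 part (b), we must have $\gamma^{(1)}$ non-decreasing, hence $\gamma_1^{(1)} \ge 1$. From Lemma
E.1 we have

\[ g(S) = \alpha_m \mathbb{E} f(X) = \sum_{k=1}^{m} \beta_k^{(m)}\, \mathbb{E} f(X_k) \ge \sum_{k=1}^{m} \beta_k^{(m)} \frac{k}{m} f(S) = \sum_{k=1}^{m} \frac{\gamma_k^{(m)}}{m} f(S) = \gamma_1^{(1)} f(S) \ge f(S), \tag{16} \]

where the last equality follows from Lemma C.4.

Since $f(X_i) \le f(S)$ for all $i$, Hoeffding's bound gives that $\Pr[\,|\tilde{g}(S) - g(S)| \ge \epsilon\, g(S)\,]$ is at most:

\[ 2 \exp\left(-\frac{2 \epsilon^2 g(S)^2 N}{\alpha_m^2 f(S)^2}\right) \le 2 \exp\left(-\frac{2 \epsilon^2 N}{\alpha_m^2}\right) \le 2 \exp\left(-\frac{2 \epsilon^2 N}{64 \log^2 m}\right), \]

using Lemma E.2. □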
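Solving the bound of Lemma 3.3 for $N$ gives a sufficient sample size: the failure probability is at most $\delta$ once $N \ge 32 \log^2 m \cdot \ln(2/\delta) / \epsilon^2$. The short Python sketch below evaluates both directions. It assumes that $\log$ denotes the natural logarithm, which the text does not state explicitly, and the function names are hypothetical.

```python
import math


def failure_bound(m, eps, num_samples):
    """Right-hand side of Lemma 3.3: 2 * exp(-eps^2 * N / (32 * log(m)^2)).

    Assumption: log is the natural logarithm; requires m >= 2.
    """
    return 2.0 * math.exp(-eps ** 2 * num_samples / (32.0 * math.log(m) ** 2))


def samples_needed(m, eps, delta):
    """Smallest integer N for which failure_bound(m, eps, N) <= delta."""
    return math.ceil(32.0 * math.log(m) ** 2 * math.log(2.0 / delta) / eps ** 2)
```

The required $N$ grows only as $\log^2 m$ in the set size, but as $1/\epsilon^2$ in the accuracy, so the accuracy parameter dominates the cost in practice.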